ABSTRACT


This four-part enquiry treats selected theoretical and empiri- 
cal developments in the Prisoner’s Dilemma. The enquiry is oriented 
within the sphere of game-theoretic conflict research, and addresses 
methodological and philosophical problems embedded in the model under 
consideration. 

In Part One, relevant taxonomic criteria of the von Neumann-Morgenstern
theory of games are reviewed, and controversies associated with both
the utility function and game-theoretic rationality
are introduced. In Part Two, salient contributions by Rapoport and 
others to the Prisoner's Dilemma are enlisted to illustrate the 
model's conceptual richness and problematic wealth. Conflicting 
principles of choice, divergent concepts of rational choice, and 
attempted resolutions of the dilemma are evaluated in the static 
mode. In Part Three, empirical interaction among strategies is 
examined in the iterated mode. A computer-simulated tournament of
competing families of strategies is conducted, as both a complement 
to and continuation of Axelrod's previous tournaments. Combinatoric 
sub-tournaments are exhaustively analyzed, and an eliminatory ecolog- 
ical scenario is generated. In Part Four, the performance of the 
maximization family of strategies is subjected to deeper analysis, 
which reveals critical strengths and weaknesses latent in its dec- 
ision-making process. 

On the whole, an inter-modal continuity obtains, which suggests 
that the maximization of expected utility, weighted toward probabil- 
istic co-operation, is a relatively effective strategic embodiment of 
Rapoport's ethic of collective rationality. 
INTRODUCTION


John von Neumann, who made several outstanding contributions to 
scientific endeavour, founded—together with Oskar Morgenstern—an 
entirely new branch of mathematics.¹ Their formal Theory of Games was
developed between 1928 and 1943 and, in the five decades since its first
appearance, it has been adapted, applied and extended to a broad
range of philosophical, mathematical, and social scientific interests.
This enquiry addresses itself to one of the formative problems
that emerged from the theory of games; namely, the Prisoner's Dilem- 
ma. 

This problem itself has developed into a panoply of multi- 
disciplinary concerns, to the extent that it would require no mean 
feat of research even to classify the existing body of literature on 
the subject. Anatol Rapoport presented a graph of the number of 
scholarly papers on the Prisoner's Dilemma for each year of the 
decade 1960-69. He found 28 papers in 1960, and a peak of 100 papers
in 1967.² In the mid-seventies, Shubik listed an eclectic bibliography
containing hundreds of scholarly articles on the Prisoner's
Dilemma.³ For that whole decade (1970-79), Axelrod counted more than
350 citations on the Prisoner's Dilemma in Psychological Abstracts
alone, which prompted his remark: "The iterated Prisoner's Dilemma has
become the E. coli of social psychology".⁴

The substantial and growing body of extant literature on the subject
serves notice that the Prisoner's Dilemma is a model rich in
implications and ramifications, both theoretical and empirical, for
researchers in many disciplines. This enquiry examines the


1 J. von Neumann & O. Morgenstern (1944), Theory of Games and
Economic Behaviour, John Wiley & Sons Inc., N.Y., sixth edition, 
1955. 


2 A. Rapoport (ed.), Game Theory as a Theory of Conflict Resolu- 
tion, D. Reidel Publishing Co., Dordrecht, 1974, p.20. 


3 M. Shubik, The Uses and Methods of Gaming, Elsevier Scientific 
Publishing Company, N.Y., 1975. 


4 R. Axelrod, 'Effective Choice in the Prisoner's Dilemma',
Journal of Conflict Resolution, 24, 1980a, pp.3-25. 
two-person Prisoner's Dilemma within the sphere of game-theoretic
conflict research. Through a consideration of methodological and
philosophical problems embedded in the model, the enquiry studies
conditional resolutions in the static mode, and develops a parametric
approach to strategic robustness in the iterated mode. 

The enquiry consists of four principal parts. Parts One and Two 
are theoretical in nature; Parts Three and Four, empirical. 

Part One recapitulates certain fundamental precepts of and 
difficulties latent in the theory of games, in so far as these
pertain to the "classical" formulation of the Prisoner's Dilemma. A 
suitable frame of reference and appropriate terminology are thereby 
introduced, which in turn allow the problem itself to be set out both 
succinctly and unambiguously. 

Part Two examines the static case of the Prisoner's Dilemma, 
and elucidates the fundamental conflict between two principles of 
choice (dominance versus maximization of expected utility). Two 
proposed "resolutions" of this conflict are considered: a decision- 
theoretic reformulation of Newcomb's paradox, and a stable meta-game- 
theoretic matrix, both of which favour mutual co-operation as a 
result of the maximization of expected utility. However, an argument
is rehearsed which asserts that, notwithstanding the validity of 
these resolutions, the dilemma persists nonetheless. 

Part Three examines the iterated case of the Prisoner's Dilem- 
ma, in which static principles of choice are replaced by dynamic 
strategies. The cogent outcomes of Axelrod's two computer-conducted
tournaments are summarized,⁵ and the results of a third tournament
are analyzed and discussed in some depth. This third tournament 
(inspired by Axelrod's former two) features competition not only 
among individual strategies, but also among "families" of related 
strategies. In the computer-simulated environment of the third 
tournament, the family of strategies which maximizes expected utility 
proves relatively effective. But (in similarity to the static case) 


5 Axelrod, 1980a, & idem., 'More Effective Choice in the Prisoner's
Dilemma', Journal of Conflict Resolution, 24, 1980b, pp.379-403.
it is argued that no single strategy (or family of strategies) can
claim absolute superiority in the iterated Prisoner's Dilemma. 

Part Four examines the performance characteristics of the 
maximization family under a higher power of analytical resolution. 
The examination reveals some interesting and unexpected properties of 
this strategic family, and subsequent analysis is devoted to an 
account of how and why these properties emerge. The enquiry's pers- 
pective and main findings are then summarized, and some pertinent 
conclusions are drawn. 

The Appendices offer the following supplementary information 
and/or data. 

Appendix One provides a glossary of strategic families, acro- 
nyms and summarized decision rules, intended for rapid reference. 

Appendix Two gives the complete table of raw scores for the 
main tournament involving twenty strategies. Each strategy competes 
against the others, and against its twin. A 20 x 20 matrix of raw 
scores results. 

Appendix Three contains efficiency tables for the combinatoric 
sub-tournaments, which are employed in the evaluation of strategic 
robustness. (The generation and usage of this data are explained in 
Chapter Eight.) 

Appendix Four affords documented samples of the computer 
programs used in the experiment and in subsequent data analysis. Ten 
tournament programs are listed, each of which simulates a competition 
between two different strategies. Thus each of the twenty strategic 
algorithms appears once in sample form. The main analytical programs, 
and some relevant supplementary routines, are also listed. 

To a large extent, this study is inspired and motivated by 
invaluable works of Professors Anatol Rapoport and Robert Axelrod 
(among other game-theorists). Its intent is both to develop a context 
which permits juxtaposition of their significant contributions, and 
also to contribute a modest sum of findings to the great wealth of 
their tradition. 


PART ONE: 
GAME-THEORETIC BACKGROUND 
Chapter One


While the theory of games embraces concepts subject to diver- 
gent interpretation (such as rationality and utility), there is 
little dispute over the theory's ability to classify games effectively.
The useful taxonomic criteria set out by von Neumann and Morgenstern
have been adopted, virtually without dissent, as the definientia
to date.

In game-theoretic terms, then, the basic Prisoner's Dilemma is 
classified as a two-person, non-zero-sum, non-co-operative game. A
brief clarification of this terminology may serve to explain not only 
what kind of game the Prisoner's Dilemma is (and is not), but also 
why it holds such fascination for game theorists of many stripes. 

Most generally, a degree of knowledge about any game is con- 
ferred by the very act of classifying it (or examining its prior 
classification, as the case may be). Just as fundamental properties 
of an element are revealed by its position in the periodic table, and 
just as common properties of flora and fauna are conveyed by
Linnaean nomenclature, so are the important properties of games 
spelled out by the respective method at hand. But a deeper purpose 
resides in the classification of games, in addition to their logical 
ordering as conceptual objects: once a game is correctly classified, 
one knows whether the theory is prescriptive, or merely descriptive, 
of its play. Thus the taxonomic structure of game theory allows the 
identification of those constituents over which the theory has 
normative power, and therein lies its usefulness. Examples will be 
cited to illustrate this point. 

To begin with, however, one may justly ask: what is meant by a
game? In reply, it seems reasonable to quote the authors of the
theory:


"The game is simply the totality of the rules which 
describe it. Every particular instance at which the game 
is played—in a particular way—from beginning to end, is 
a play. The game consists of a series of moves, and the 
play of a sequence of choices."¹


1 Neumann & Morgenstern, 1955, p.49.
Viewed in this way, virtually any activity or pursuit can be treated
as a game, so long as it can be defined or otherwise described by 
some set of rules. 

Thus the theory of games is not restricted to pastimes of the 
"parlour game" variety. It can be applied to a range of competitions,
conflicts of interest, and situations of decision-making under risk. 
Most generally, then, from a game-theoretic perspective, bridge can 
be viewed as a game of cards defined by the rules of Hoyle; roulette, 
a game of chance governed by the rules of probability; mathematics, a
game of symbolic association developed according to the rules of
consistency; boxing, a game of pugilism ritualized by the rules of
Queensberry; driving a motor vehicle, a game of transportation 
described by the rules of the road; banking, a game of monetary 
transaction affected by the rules of economy; running for public 
office, a game of politics influenced by the rules of expediency; 
diplomacy, a game of international relations mediated by the rules of 
policy. 

The theory of games can scarcely be termed modest, at least in 
taxonomic scope. It can classify a staggering range of activities 
according to an elegant but limited set of criteria which are quan- 
titative and/or Boolean in character, and which do not take into 
account the correspondingly broad set of qualitative purposes that 
may underlie such activities, from diversion to stimulation, from 
profit to ambition, from savagery to statesmanship. The theory, 
however, pays a fair price for its universality: although it can 
classify a great number of activities, its normative power turns out 
to be quite constrained. The theory thus describes the play of many 
games, but prescribes the play for relatively few. 

Specifically, the principal taxonomic criteria that pertain to 
the Prisoner's Dilemma can be described as follows: 

(i) Number of Players

In general, a game can be played by N persons, where $N \geq 1$ (N is
greater than or equal to unity).

Single-person games, with one player, take place against some 
state of nature, be it organic or synthetic. A solitary card game, 
for example, is played against a given state of the deck. 
Two-person games form the core of game theory, whose axioms,
postulates and theorems are extended, where possible, from the two- 
player case to cases involving more than two players. 

By convention, the term "N-person games" refers to games involving three
or more players. Not surprisingly, the complexity of game-theoretic 
analysis tends to increase as a function of the number of players. 
(The situation is loosely analogous to dynamical problems in physics 
involving two bodies, three bodies, and many bodies.) N-person
Prisoner's Dilemmas lie beyond the scope of this study, which con- 
fines itself to the two-person game. However, multiple pairs are 
involved in the iterated mode, where the situation is analogous to a 
chess tournament. (Chess remains a two-person game, although multiple 
pairs of players can compete in iterated competitions.) 

(ii) Constancy or Non-Constancy of Sum 

With each game is associated a set of payoffs. These are the 
gains or forfeitures of each player, which result from the play. A 
constant-sum game is a game in which the algebraic sum of payoffs is 
constant. The constant itself may be less than zero, zero, or greater 
than zero. In tournament chess, for example, the winning player 
receives one point; the losing player, zero points; and in the event 
of a stalemate or draw, each player receives a half-point. Tournament 
chess is thus a constant-sum game whose sum equals unity. 

A zero-sum game forms a special class of constant-sum games, in 
which the algebraic sum of payoffs equals zero. In poker, for in- 
stance, the total sum of monies (or matchsticks) won by the winning 
players equals the total sum of monies (or matchsticks) lost by the 
losing players. This remains vacuously true if all players "break 
even"; i.e. if no-one wins or loses. Poker is thus a zero-sum game. 

A non-constant-sum game is a game whose sum of payoffs is not 
constant. In cribbage, for instance, each player accumulates points 
until one and only one player wins by surpassing one hundred and 
twenty points. The algebraic sum of all players’ points is non-zero, 
and can assume a range of values up to and including 121 + 120(N-1), 
for an N-player game. Cribbage is thus a non-constant-sum game, with
respect to points scored. 
Every game is either constant-sum and non-zero-sum, zero-sum,
or non-constant-sum, with respect to a particular set of payoffs. A 
game may have more than one set of payoffs. War, for instance, is a 
negative non-constant-sum game with respect to lives lost in battle; 
it is also a zero-sum game with respect to territory that changes 
hands as a result of battle. 

It should be noted that any constant-sum game can be repre- 
sented as a zero-sum game (by means of adjusting the payoffs in its 
matrix).² Consequently, "Every constant-sum game is strategically
equivalent to a zero-sum game."³ In the broadest sense, then, it is
most convenient to refer to a game as either zero-sum or non-zero- 
sum. 
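
The adjustment of payoffs mentioned above can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure taken from the sources cited: the helper names `constant_sum` and `to_zero_sum` and the 2-by-2 example game are assumptions introduced here. Subtracting half of the constant from each player's payoff leaves every preference ordering intact while making each pair of payoffs sum to zero.

```python
from fractions import Fraction

def constant_sum(payoffs):
    """Return the common sum of every (a, b) payoff pair, or None if the sums differ."""
    sums = {a + b for row in payoffs for (a, b) in row}
    return sums.pop() if len(sums) == 1 else None

def to_zero_sum(payoffs):
    """Shift each player's payoff by half the constant so that every pair sums to zero."""
    c = constant_sum(payoffs)
    if c is None:
        raise ValueError("game is not constant-sum")
    half = Fraction(c) / 2
    return [[(a - half, b - half) for (a, b) in row] for row in payoffs]

# Hypothetical 2-by-2 game scored like tournament chess: every pair sums to 1.
half_point = Fraction(1, 2)
chess_like = [[(half_point, half_point), (1, 0)],
              [(0, 1), (half_point, half_point)]]

zero_sum = to_zero_sum(chess_like)
for row in zero_sum:
    print(row)
```

In the shifted game a win is worth one half, a loss minus one half, and a draw zero, so the players' rankings of the outcomes are unchanged.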

(iii) Co-operation 

A game is said to be co-operative (or negotiable) if the 
players can communicate their respective intentions prior to a move, 
or agree upon co-ordinated strategies, and thereby influence the 
play. Arbitration, negotiation, collusion, and conciliation, among 
other processes, reflect possible aspects of co-operation. The sphere 
of economics, for instance, admits of a host of co-operative games,⁴
as do numerous social interactions in daily life. 

A game is said to be non-co-operative (or non-negotiable) if
"absolutely no preplay communication is permitted between the players".⁵
Conceivably, the rapidity or automation of play itself can
weigh heavily against co-operation, if such play outpaces the speed 


2 A proof can be found in R. Jeffrey, The Logic of Decision,
McGraw-Hill Book Company, New York, 1965, pp.14-30. 


3 Neumann & Morgenstern, 1955, p.348.


4 A pioneer of negotiable games is Nash. E.g. see J. Nash, 'The
Bargaining Problem', Econometrica, 18, 1950, pp.155-162. For a 
perspective on negotiated games, see e.g. A. Rapoport, Two-Person 
Game Theory, The University of Michigan Press, Ann Arbor, 1966, 
pp.94-122. For a study of co-operative games in terms of economic 
cybernetics, see e.g. Vorob'ev, N., Game Theory, Lectures for Econom- 
ists and Systems Scientists, s.v. S. Kotz, Springer-Verlag, N.Y., 
1977. 


5 D. Luce & H. Raiffa, Games and Decisions, John Wiley & Sons
Inc., N.Y., 1957, p.89. 
of communication. To cite a most drastic example, the Cold War has
been non-co-operative in the sense that a nuclear war could be triggered
accidentally, without adequate time for human intervention.⁶
The installation of the so-called "Hot Line" between Washington and 
Moscow represented an early attempt, in game-theoretic terms, to 
offset cybernetic non-co-operativeness by introducing an element of 
human communication at the highest echelon of decision-making. 

In general, a game may be co-operative, non-co-operative, or 
partly co-operative, with respect to the players' choices and their 
respective payoffs. 

(iv) Strictness of Determination 

A zero-sum game is said to be strictly determined if and only 
if a saddle point exists in its normal matrix representation. This 
property proceeds from the fundamental theorem of the functional 
calculus of two-person zero-sum games, which states the necessary and 
sufficient condition for the existence of a saddle point. From 
subsequent commentary in game-theoretic literature, it is evident 
that two foci of contention (utility and rationality) originate from 
the postulates leading to the statement of this theorem. An outline 
of the theorem follows. 

A game matrix is constructed according to the following convention:
suppose two players, A and B, have respective choices


$\{a_1, a_2, \ldots, a_n\}$ and $\{b_1, b_2, \ldots, b_m\}$


for a given move in a zero-sum game. Then an n-by-m matrix of mutual
choices obtains:


6 E.g. see A. Grinyer & P. Smoker, "It Couldn't Happen - Could 
It? An Assessment of the Probability of Accidental Nuclear War",
Richardson Institute for Conflict and Peace Research, University of 
Lancaster, 1986; and D. Frei, Risks of Unintentional Nuclear War, 
Published in Cooperation with the United Nations Institute for 
Disarmament Research, Allanheld, Osmun & Co. Inc., Totowa, N.J., 
1983. 
Game 1.1: Generalized Matrix of Choices


$$
\begin{array}{cccc}
(a_1,b_1) & (a_1,b_2) & \cdots & (a_1,b_m) \\
\vdots & \vdots & & \vdots \\
(a_n,b_1) & (a_n,b_2) & \cdots & (a_n,b_m)
\end{array}
$$


Every possible joint choice of the two players—and thus every 
hypothetical game-state for that move—is uniquely represented by 
some entry in the matrix. But Game 1.1 is unplayable, since the 
players can neither express preference among possible choices, nor 
implement principles of choice, without first knowing the payoffs for 
each possible outcome of their joint choosing. Once the payoffs are 
stipulated, they must be value-ordered according to the preferences 
of the players. And so arises the necessity of transforming each 
outcome (or payoff) into its respective value to each player. 

For the time being, let the existence of such a transformation 
be assumed. Von Neumann and Morgenstern call it $\phi$, the "utility
function".⁷


The function is mathematically acceptable, but game-theoretically
controversial. It maps the preference for each game-state
into the utility of that game-state, $U$, to each player. The
utility itself is a real number. So, for the xth choice of player A,
and the yth choice of player B,

$$U_{xy} = \phi(a_x, b_y)$$

For convenience, let $(a_x, b_y)$ be written simply as $(x,y)$. Then

$$U_{xy} = \phi(x,y)$$

where, by convention, $U_{xy}$ is the utility of the joint play $(x,y)$ to


7 Neumann & Morgenstern, 1955, pp.88-123.
player A. Since the game in question is zero-sum, the utility of the
joint play $(x,y)$ to player B is simply $-U_{xy}$. Applying $\phi$ to all $(x,y)$
in Game 1.1 results in a playable game:


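Game 1.2: Generalized Matrix of Utilities


$$
\begin{array}{ccccc}
  &           & B         &        &           \\
  & \phi(1,1) & \phi(1,2) & \cdots & \phi(1,m) \\
A & \phi(2,1) & \phi(2,2) & \cdots & \phi(2,m) \\
  & \vdots    & \vdots    &        & \vdots    \\
  & \phi(n,1) & \phi(n,2) & \cdots & \phi(n,m)
\end{array}
$$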


By virtue of the utility function, the players can assess the 
values of all possible outcomes for that move, and each player can 
then exercise his individual preference accordingly.
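
The step from a matrix of choices to a matrix of utilities can be illustrated with a short Python sketch. The utility function used here is purely hypothetical, chosen so that the resulting 3-by-3 matrix is small and easy to read; the names `utility_matrix` and `phi` are likewise assumptions made for illustration.

```python
def utility_matrix(n, m, phi):
    """Apply phi to every joint choice (x, y); rows are A's choices, columns are B's."""
    return [[phi(x, y) for y in range(1, m + 1)] for x in range(1, n + 1)]

def phi(x, y):
    # Hypothetical utility of the joint play (x, y) to player A (an assumption).
    return 3 * (x - 1) + y

game = utility_matrix(3, 3, phi)

# In a zero-sum game, B's utility is the negative of A's.
pairs = [[(u, -u) for u in row] for row in game]

for row in pairs:
    print(row)
```

Each printed entry is a pair (utility to A, utility to B), which is all the players need in order to rank the outcomes of the move.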

At this juncture, von Neumann and Morgenstern introduce the Max
and Min operators.⁸ $\max_x \phi(x,y)$ is the maximum value of $\phi(x,y)$ in
column y, and $\min_y \phi(x,y)$ is the minimum value of $\phi(x,y)$ in row x.
Then $\max_y \max_x \phi(x,y)$ is the maximum of column maxima; $\min_x \min_y \phi(x,y)$,
the minimum of row minima.

It can be shown that the operators $[\max_x, \max_y]$ and $[\min_x, \min_y]$
commute. In other words, the maximum of column maxima is congruent
with the maximum of row maxima, and the minimum of row minima is
congruent with the minimum of column minima. But there is no generalization
as to the commutativity or non-commutativity of $[\max_x, \min_y]$.
Two examples illustrate the point:


Game 1.3: A Case in Which All Operators Commute

               B

     1,-1    2,-2    3,-3
A    4,-4    5,-5    6,-6
     7,-7    8,-8    9,-9


In Game 1.3, with respect to player A, the column maxima are 
seven, eight and nine respectively; the maximum of column maxima is 


8 Ibid.
therefore nine. The row maxima are three, six and nine respectively;
the maximum of row maxima is therefore nine. Similarly, the minimum 
of row minima is congruent with the minimum of column minima (at 
one). 

And in this case, the minimum of column maxima happens to be 
congruent with the maximum of row minima (at seven). But consider 
another case: 


Game 1.4: A Case in Which Not All Operators Commute

               B

     9,-9    2,-2    3,-3
A    4,-4    5,-5    6,-6
     7,-7    8,-8    1,-1


In Game 1.4, again with respect to player A, the maximum of
column maxima is congruent with the maximum of row maxima (at nine); 
and the minimum of row minima is congruent with the minimum of column 
minima (at one). 

But in this case, the minimum of column maxima is six, whereas 
the maximum of row minima is four. The two are not congruent. 
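
The operator values quoted for Games 1.3 and 1.4 can be checked mechanically. The following Python sketch is offered only as an illustration: the function names are assumptions, and each matrix holds player A's payoffs alone (B's being their negatives).

```python
# Player A's payoffs in Games 1.3 and 1.4 (rows are A's choices, columns are B's).
GAME_1_3 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
GAME_1_4 = [[9, 2, 3], [4, 5, 6], [7, 8, 1]]

def column_maxima(m):
    return [max(col) for col in zip(*m)]

def column_minima(m):
    return [min(col) for col in zip(*m)]

def row_maxima(m):
    return [max(row) for row in m]

def row_minima(m):
    return [min(row) for row in m]

def minimax(m):
    """Minimum of column maxima."""
    return min(column_maxima(m))

def maximin(m):
    """Maximum of row minima."""
    return max(row_minima(m))

for name, m in (("Game 1.3", GAME_1_3), ("Game 1.4", GAME_1_4)):
    # Like operators commute in both games.
    assert max(column_maxima(m)) == max(row_maxima(m))
    assert min(row_minima(m)) == min(column_minima(m))
    # Unlike operators need not commute.
    print(name, "min of column maxima:", minimax(m), "max of row minima:", maximin(m))
```

For Game 1.3 the two mixed values coincide at seven; for Game 1.4 they are six and four, as stated in the text.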

Although Games 1.3 and 1.4 have been viewed from the perspec- 
tive of player A, the assertions concerning column and row operators 
are symmetrically consistent for player B. If viewed from player B's
perspective, what was a column (to player A) is now a row, and vice- 
versa. 

Mutatis mutandis, for player B in Game 1.3, the minimum of
column minima is congruent with the minimum of row minima (at minus 
nine); the maximum of column maxima is congruent with the maximum of 
row maxima (at minus one); and the minimum of column maxima happens 
to be congruent with the maximum of row minima (at minus seven). 

Similarly, for player B in Game 1.4, congruency obtains for the 
minima of column and row minima (at minus nine), and for the maxima 
of row and column maxima (at minus one). However, the minimum of 
column maxima (at minus four) is not congruent with the maximum of 
row minima (at minus six). 
Thus the operators are symmetrically consistent for both
players. 

Once it has been established that like operators always commute,
while unlike operators do not always commute, then a saddle
point can be defined:

"Let $\phi(x,y)$ be any two-variable function. Then $(x_0, y_0)$ is
a saddle point of $\phi$ if at the same time $\phi(x, y_0)$ assumes
its maximum at $x = x_0$ and $\phi(x_0, y)$ assumes its minimum at
$y = y_0$."⁹

Now, the fundamental theorem under consideration states that
$\max_x \min_y \phi(x,y) = \min_y \max_x \phi(x,y)$ if, and only if, there exists a saddle
point $(x_0, y_0)$. For a given game, there is no a priori guarantee of
the existence of such a point. But if a saddle point exists, then
that game is said to be strictly determined.
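
The definition of a saddle point translates directly into a search over the matrix: an entry qualifies if it is the maximum of its column and, at the same time, the minimum of its row. The Python sketch below is a minimal illustration under that reading; the function name `saddle_points` is an assumption, and the two matrices are player A's payoffs from Games 1.3 and 1.4.

```python
GAME_1_3 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
GAME_1_4 = [[9, 2, 3], [4, 5, 6], [7, 8, 1]]

def saddle_points(m):
    """Return (row, column, value) for each entry that is the maximum of its
    column and the minimum of its row, using 1-based indices."""
    points = []
    for x, row in enumerate(m):
        for y, value in enumerate(row):
            column = [r[y] for r in m]
            if value == max(column) and value == min(row):
                points.append((x + 1, y + 1, value))
    return points

def strictly_determined(m):
    """Check the condition of the theorem: max of row minima equals min of column maxima."""
    maximin = max(min(row) for row in m)
    minimax = min(max(col) for col in zip(*m))
    return maximin == minimax

print(saddle_points(GAME_1_3), strictly_determined(GAME_1_3))
print(saddle_points(GAME_1_4), strictly_determined(GAME_1_4))
```

The search finds a single saddle point in Game 1.3 (third row, first column, value seven) and none in Game 1.4, in agreement with the equality test of the theorem.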

The property of strict determination is crucial both to the 
rationalization of a game, and to its play. Games which have this 
property are rationalized, and played, in a critically different way
from those which lack it. To appreciate the difference in play, let
games 1.3 and 1.4 be set out side-by-side, and re-interpreted (as
games 1.5 and 1.6 respectively) in terms of this property. 

In Game 1.5, Player A stands to gain no matter which outcome 
obtains. If A chooses the first row, he can gain no less than one; if 
the second row, no less than four; if the third row, no less than 
seven. A payoff of seven is the best of the worst possible outcomes 
for player A. Thus A should choose the row containing this payoff, 
since such a choice would maximize his minimum gain (hence the term
"maximin").


9 Ibid., p.95.
Game 1.5: A Saddle Point              Game 1.6: No Saddle Point

               B                                     B

     1,-1    2,-2    3,-3                  9,-9    2,-2    3,-3
A    4,-4    5,-5    6,-6             A    4,-4    5,-5    6,-6
     7,-7    8,-8    9,-9                  7,-7    8,-8    1,-1


Similarly, in Game 1.5, player B stands to forfeit no matter 
which outcome obtains. If B chooses the first column, he can forfeit 
no more than seven; if the second column, no more than eight; if the 
third column, no more than nine. A payoff of minus seven is the best 
of the worst possible outcomes for player B. Thus B should choose the 
column containing this payoff, since such a choice would minimize his 
maximum forfeiture (hence the term "minimax"). 

Clearly, the existence of a saddle point at (7,-7) is prescrip- 
tive to both players. Each player fares best in choosing his maximin 
(or minimax, respectively), regardless of the other player's choice. 

In Game 1.6, however, no saddle point exists. Player A's 
maximin is four; player B's minimax is minus six. 

Knowing this, player A might reason "B should choose the third
column, since it contains his minimax. Therefore I should choose the
second row, in order to gain six."

Knowing that, player B might reason "If A chooses the second 
row, then I should choose the first column, in order to forfeit only 
four."

Knowing this, player A might reason "If B chooses the first 
column, then I should choose the first row, in order to gain nine." 

Knowing that, player B might reason "If A chooses the first 
row, then I should choose the second column, in order to forfeit only 
two." 

Knowing this, player A might reason "If B chooses the second 
column, then I should choose the third row, in order to gain eight." 

Knowing that, player B might reason "If A chooses the third
row, then I should choose the third column, in order to forfeit only 
one." 
Knowing this, player A might reason "If B chooses the third
column, then I should choose the second row, in order to gain six." 

Thus the players find themselves in a strategic infinite
regress.¹⁰ Clearly, in games without a saddle point, neither player
has a choice that is unconditionally "best", independent of what
the other player chooses.
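
The chain of reasoning just described can be reproduced by alternating best replies. The Python sketch below is illustrative only: the function names are assumptions, the matrix holds player A's payoffs in Game 1.6, and B is assumed to open with the third column (his minimax column).

```python
GAME_1_6 = [[9, 2, 3], [4, 5, 6], [7, 8, 1]]

def best_row(m, col):
    """A's best reply: the row with the largest payoff in the given column."""
    return max(range(len(m)), key=lambda r: m[r][col])

def best_col(m, row):
    """B's best reply: the column with the smallest payoff to A in the given row."""
    return min(range(len(m[0])), key=lambda c: m[row][c])

def regress(m, start_col, steps):
    """Alternate best replies, starting from a column chosen by B.
    Each trace entry is (player, 1-based choice, amount A gains / B forfeits)."""
    trace, col = [], start_col
    for _ in range(steps):
        row = best_row(m, col)
        trace.append(("A", row + 1, m[row][col]))
        col = best_col(m, row)
        trace.append(("B", col + 1, m[row][col]))
    return trace

trace = regress(GAME_1_6, start_col=2, steps=4)
for player, choice, amount in trace:
    print(player, "chooses", choice, "for", amount)
```

After six replies the trace returns to its first entry (A choosing the second row to gain six), so the sequence of choices repeats indefinitely.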

The difference between the play of zero-sum games with and 
without saddle points is readily appreciable. Even so, a questionable 
assumption was unavoidably smuggled into the argument; namely, that 
both players wish to maximize their gains and minimize their losses, 
respectively. If one assumes, for the time being, that to be "ratio- 
nal" is to play maximin or minimax (if the game has a saddle point), 
then an important feature of strictly determined games comes to 
light. 

In Game 1.5, suppose that player A is rational, and player B is
irrational. Then, according to the assumption about rationality, A 
would choose the third row, which contains his maximin. But player B, 
being irrational, would not choose the first column, which contains 
his minimax. In that case, A would gain eight or nine (as opposed to 
seven), and B would forfeit eight or nine (as opposed to seven). 
Generally stated, it amounts to this: in a strictly determined game,
a rational player can fare no better than by playing maximin (or 
minimax) if his opponent is rational; and can fare no worse than by 
playing maximin (or minimax) if his opponent is irrational. 
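
This guarantee can be verified for Game 1.5 with a brief Python sketch. It is an illustration only; the helper name `maximin_row` is an assumption, and the matrix holds player A's payoffs.

```python
GAME_1_5 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

def maximin_row(m):
    """Index of the row whose minimum payoff is largest."""
    return max(range(len(m)), key=lambda r: min(m[r]))

row = maximin_row(GAME_1_5)   # 0-based index of A's maximin row
gains = GAME_1_5[row]         # A's gain against each of B's columns

# Against a rational B, who plays his minimax (first) column,
# no other row does better for A than the maximin row.
first_column = [r[0] for r in GAME_1_5]
assert max(first_column) == gains[0]

# Against any other column, A's gain from the maximin row is at least as large.
assert all(g >= gains[0] for g in gains)

print("maximin row:", row + 1, "gains by column:", gains)
```

Player A's third row yields seven against B's minimax column and eight or nine otherwise, which matches the figures given in the text.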

Von Neumann and Morgenstern bring the point home: for a rational
player in a strictly determined game, it makes no difference
whether his opponent is rational or irrational, and thus "the rationality
of the opponent can be assumed, because the irrationality of
his opponent can never harm a [rational] player."¹¹

Once again, this conclusion is based upon a prior—and not 
necessarily justifiable—assumption about the meaning of rationality. 


So, one can assume the rationality of an opponent only in strictly


10 The same point was made, using a different example, in A.
Rapoport & A. Chammah, Prisoner's Dilemma, University of Michigan 
Press, Ann Arbor, 1965, p.23. 


11 Neumann & Morgenstern, 1955, p.128.
determined zero-sum games, and only if one has previously assumed a
possible meaning of rationality itself. But in the universe of games, 
relatively few are strictly determined; so the prescriptive aspect of 
the theory wields absolute power over a fairly thinly-populated 
realm. 

It would seem that the two-person, zero-sum game with a saddle 
point constitutes "the limit of applicability of game theory as a 
normative (or prescriptive) theory."¹² Nevertheless, as a descriptive
theory, its power of classification appears virtually limitless. 
Although the theory is not prescriptive for the majority of games, 
it remains a triumph in taxonomy, and conduces to a better under- 
standing of those games which it can only describe. 

Despite the non-zero-sum status of the Prisoner's Dilemma, the 
saddle-point criterion can exert an inimical influence upon its play. 
Since the Prisoner's Dilemma is a two-person, non-zero-sum game, the 
theory cannot prescribe an unconditional resolution.¹³ However, it
does describe a multitude of conditional resolutions. And therein
lies the dilemma's appeal, which revolves around elucidating the
variegated conditions under which resolutions can be achieved. Before
examining the Prisoner's Dilemma, one must complete the game-theoretic
background sketch, by addressing two questionable assumptions
made in the development of the taxonomy; namely, the existence of the
utility function, and the meaning of rationality.


12 Rapoport & Chammah, 1965, p.23.


13 Von Neumann and Morgenstern showed that any N-person, non-zero-sum
game can be re-interpreted as an (N+1)-person, zero-sum
game. However, if the two-person Prisoner's Dilemma were re-inter- 
preted as a three-person, zero-sum game, novel problems of coalition 
formation would arise. E.g. see A. Rapoport, Fights, Games, and
Debates, The University of Michigan Press, Ann Arbor, 1960, pp.195- 
196. 
Chapter Two
The Utility Function 


The utility function can be regarded as both necessary to yet 
insufficient for the theory of games. In the absence of a utility 
measure, the players cannot value-order payoffs of different game-states
(or outcomes) contingent upon their (the players') possible
choices; in the absence of value-ordering, the players cannot express 
their preferences; and in the absence of expression of preference, no 
moves are made, and the game cannot be played. In the presence of a 
utility measure, however, game theory inherits problems already 
embedded, at the axiomatic level, in utility theory itself. As Luce 
and Raiffa point out: 


". . . utility theory is not a part of game theory. It is
true that it was created as a pillar for game theory, but
it can stand apart and has applicability in other con-
texts."


And while the edifice of game theory does not lack support from said 
pillar, its architecture is definitely constrained by weaknesses in 
the nature of the support. 

The two chief assumptions in von Neumann's and Morgenstern's 
utility theory are well-summarized, by Luce and Raiffa, as follows: 


(1) "That, given two alternatives, a person either 
prefers one to the other or is indifferent between them." 
(2) "That there are certain well-defined chance events 
having probabilities attached to them which are manipu- 
lated according to the rules of probability calculus.” 


But, as Luce and Raiffa indicate, both assumptions are subject to
criticism.3 Critical examples follow.

For the first assumption, it is understood that the utility 
function, U, quantifies the preferences of the players. To accomplish 
this, the utility function must have two minimally necessary proper- 
ties: transitivity, and linear transformability. 


1 Luce & Raiffa, 1957, p.12.


2 Ibid., pp.371-373.


3 Ibid.

(i) Transitivity: given any outcomes v and w, if player P
prefers v to w, then U(v) > U(w). In other words, if P prefers an
apple to an orange, then the utility of an apple must be greater than
the utility of an orange, with respect to player P.

(ii) Linear transformability: if the probabilities that v and w
obtain are p and (1-p) respectively, then U[p(v) + (1-p)(w)] = pU(v)
+ (1-p)U(w). In other words, the utility of a gamble that yields an
apple with probability p, and an orange with probability (1-p), is
equal to the sum of the products of the probability that each fruit
obtains and the utility of that fruit, with respect to player P.

But the utility-theoretic assumption (1), concerning preference,
breaks down in the following example: suppose P prefers an apple to
an orange, an orange to a pear, a pear to a banana, and a banana to
an apple. Let these fruits be represented by v, w, x and y
respectively. By the property of transitivity, U(v) > U(w),
U(w) > U(x), U(x) > U(y), and U(y) > U(v). Now suppose P is offered a
choice between either an apple and an orange, or a pear and a banana.
In this case the utility function cannot value-order P's preferences,
since it cannot determine whether [U(v) + U(w)] is greater than, less
than, or equal to [U(x) + U(y)].4

This breakdown stems from the circularity of P's preferences 
which, though conceivable, is not orderable by the relation of 
transitivity. The problem belongs to the same class as Arrow's
"voter's paradox".5
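The circularity can be made concrete with a short, purely illustrative sketch. It searches every strict ranking of the four fruits for one that honours all of P's stated preferences; since any utility function with U(a) > U(b) for each preferred pair would induce such a ranking, an empty result means that no such function exists. The names PREFERENCES and consistent_rankings are assumptions of this example, not part of the theory.

```python
from itertools import permutations

# Strict preferences from the example, as (better, worse) pairs.
# v = apple, w = orange, x = pear, y = banana, as in the text.
PREFERENCES = [("v", "w"), ("w", "x"), ("x", "y"), ("y", "v")]


def consistent_rankings(preferences, outcomes):
    """Return every strict ranking of `outcomes` that honours all preferences.

    A ranking is a tuple listed from most to least preferred.
    """
    rankings = []
    for order in permutations(outcomes):
        position = {outcome: i for i, outcome in enumerate(order)}
        if all(position[a] < position[b] for a, b in preferences):
            rankings.append(order)
    return rankings


if __name__ == "__main__":
    # The full cycle admits no ranking; dropping "banana over apple" admits one.
    print(consistent_rankings(PREFERENCES, "vwxy"))
    print(consistent_rankings(PREFERENCES[:3], "vwxy"))
```

Removing any one link of the cycle restores a unique ranking, which suggests that it is the circularity itself, rather than the number of alternatives, that defeats the ordering.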

Another breakdown of the assumption of preference occurs in the 
next example. Empirically, 


. . . it was found that certain people preferred any bet
in which they obtained one of two amounts of money with 
probability 1/2, to a bet in which the probabilities are 


4 Neumann & Morgenstern were well aware of this shortcoming.
They termed it "the relationship of incomparability"; 1955, p.630.


5 If P1 prefers candidates X > Y > Z, P2 prefers Z > X > Y, and P3
prefers Y > Z > X, then two of three people prefer X > Y, two of
three prefer Y > Z, and two of three prefer Z > X. See K. Arrow,
Social Choice and Individual Values, Yale University Press, New
Haven, 1970, p.33.

1/4 and 3/4, providing the average value obtained was the
same.

But modelling this result with the utility function leads swiftly to 
a contradiction. 

Let the utility of £x = U(x). Now consider these initial
wagers:

    Bet #1: p(£150) = p(£100) = 1/2; average gain of £125
    Bet #2: p(£200) = 1/4, p(£100) = 3/4; average gain of £125

Since the first wager was found to be empirically preferable,

    (1/2)U(100) + (1/2)U(150) > (1/4)U(200) + (3/4)U(100)
or  2U(150) > U(100) + U(200)                                   (1)

If the amounts in Bet #1 are changed to £50 and £200, while those in
Bet #2 are changed to £50 and £150, then

    (1/2)U(50) + (1/2)U(200) > (1/4)U(50) + (3/4)U(150)         (2)

If the amounts in Bet #1 are changed to £50 and £100, while those in
Bet #2 remain at £50 and £150, then

    (1/2)U(50) + (1/2)U(100) > (3/4)U(50) + (1/4)U(150)         (3)

Adding inequalities (2) and (3), then multiplying by two, gives

    U(100) + U(200) > 2U(150),

which contradicts inequality (1). Thus Davis concludes, "No utility
function of the type we have been considering can possibly describe
such preferences."
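The algebra leading to the contradiction can be replayed mechanically. In the following sketch, offered only as an illustration, each inequality is stored as the coefficients of U(50), U(100), U(150) and U(200) in the expression "preferred bet minus other bet > 0"; the helper names bet, prefer and combine are inventions of this example. If a positively-weighted sum of the three inequalities has no surviving coefficients, it asserts 0 > 0, so no assignment of utilities can satisfy all three.

```python
from fractions import Fraction as F


def bet(*terms):
    """Expected utility of a bet, as {amount: coefficient of U(amount)}."""
    combo = {}
    for prob, amount in terms:
        combo[amount] = combo.get(amount, 0) + prob
    return combo


def prefer(better, worse):
    """Coefficients of U(.) in the strict inequality EU(better) - EU(worse) > 0."""
    amounts = set(better) | set(worse)
    return {a: better.get(a, 0) - worse.get(a, 0) for a in amounts}


def combine(*weighted):
    """Positively-weighted sum of inequalities, dropping zero coefficients."""
    total = {}
    for weight, ineq in weighted:
        for amount, coeff in ineq.items():
            total[amount] = total.get(amount, 0) + weight * coeff
    return {a: c for a, c in total.items() if c != 0}


half, quarter = F(1, 2), F(1, 4)
# Inequalities (1), (2) and (3) of the text.
ineq1 = prefer(bet((half, 100), (half, 150)), bet((quarter, 200), (3 * quarter, 100)))
ineq2 = prefer(bet((half, 50), (half, 200)), bet((quarter, 50), (3 * quarter, 150)))
ineq3 = prefer(bet((half, 50), (half, 100)), bet((3 * quarter, 50), (quarter, 150)))

if __name__ == "__main__":
    # Twice the sum of (2) and (3): U(100) - 2U(150) + U(200) > 0.
    print(combine((2, ineq2), (2, ineq3)))
    # Adding four times (1) cancels every term, leaving 0 > 0.
    print(combine((4, ineq1), (2, ineq2), (2, ineq3)))
```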

Mathematically speaking, relation (1) should be an equation, 
not an inequality. It should be an equation because the probabilis- 
tically-averaged gains are equal for both initial wagers. The ine- 
quality relation was employed to express a psychological preference, 
but was then manipulated as though it were purely mathematical. The 
contradiction does not arise from a reductio ad absurdum; rather, it 
inheres in the initial employment of the utility function in mutually 
inconsistent senses: the logical and the psychological. The argument 
was constructed from a faulty implicit premise; namely, that x can be 
simultaneously equal to y, and greater than y. Nonetheless, the
psychological preference is empirical and permissible, and one cannot 


6 M. Davis, Game Theory, Basic Books Inc., New York, 1970,
pp.63-64.

discount it in order to salvage consistency. So the utility function, as
constituted, remains inadequate for its purpose. 

In order to generalize certain problems, one may appeal to 
Rapoport's exposition on the scaling of utilities. There are three
different scales on which utilities can be measured: the ordinal, the 
interval, and the ratio. 

The ordinal scale is the weakest of the three. It employs the
relation of transitivity, but does not assign differences of mag-
nitude. The ordinal scale can specify only that A > B > C. It is
invariant with respect to positive monotone transformations; i.e. if
U(A) > U(B), then U(Ax + y) > U(Bx + y) (where x > 0).

The interval scale is stronger than the ordinal, but weaker
than the ratio. It can specify both transitivity and difference of
magnitude; i.e. A > B > C and (A-B) >, <, or = (B-C). The interval
scale is invariant with respect to linear transformations; i.e. if
U(A-B) > U(B-C), then U[(A-B)x + y] > U[(B-C)x + y] (where x > 0).

The ratio scale is the strongest of the three. It can specify
transitivity and the actual ratios of magnitude; i.e. A > B > C, and
A/B = y, B/C = z, C/A = 1/yz. The ratio scale is invariant with
respect to similarity transformations; i.e. if U(A/B) > U(B/C), then
U[(A/B)x] > U[(B/C)x] (where x > 0).

By means of these scales, one may appreciate why any constant- 
sum game can be represented as a zero-sum game. Consider the follow- 
ing example: 


Game 2.1 - A Constant-Sum Game        Game 2.2 - A Zero-Sum Game

             B                                    B
       5,5      8,2                         0,0      3,-3
   A                                    A
       7,3     -4,14                        2,-2    -9,9


The payoffs in Game 2.2 were obtained by subtracting five from 
each payoff in Game 2.1. Similarly, for any constant-sum game, some 
linear transformation exists which maps it to a zero-sum game. (And 


8 Rapoport, 1966, pp.24-28.

if either player has a winning strategy in the constant-sum game,
then the same player has an identical winning strategy in its zero-
sum representation.9) Note that any such linear transformation
satisfies the requirements of the interval scale, but not those of the
ratio scale. With respect to either player, the ratio of differences 
of payoffs remains constant for both games, while the ratio of 
magnitudes does not. 
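The mapping from Game 2.1 to Game 2.2 can be sketched in a few lines. The example is illustrative only: the names GAME_2_1, constant_sum and to_zero_sum are assumptions, the rows and columns are indexed 0 and 1 because the text does not label them, and the bottom-right payoff of Game 2.1 is taken to be (-4, 14), the value consistent with the (-9, 9) of Game 2.2 and with a constant sum of ten.

```python
from fractions import Fraction

# Game 2.1: (row chosen by A, column chosen by B) -> (payoff to A, payoff to B).
GAME_2_1 = {(0, 0): (5, 5), (0, 1): (8, 2), (1, 0): (7, 3), (1, 1): (-4, 14)}


def constant_sum(game):
    """Return the common sum of the payoffs, or None if the sum varies."""
    sums = {a + b for a, b in game.values()}
    return sums.pop() if len(sums) == 1 else None


def to_zero_sum(game):
    """Subtract half the constant sum from every payoff (a linear shift)."""
    total = constant_sum(game)
    if total is None:
        raise ValueError("not a constant-sum game")
    shift = Fraction(total, 2)
    return {cell: (a - shift, b - shift) for cell, (a, b) in game.items()}


if __name__ == "__main__":
    game_2_2 = to_zero_sum(GAME_2_1)
    print(constant_sum(GAME_2_1), constant_sum(game_2_2))
    print(game_2_2)
```

Because the shift is the same for every cell, differences between payoffs are untouched while their ratios change, which is the sense in which the transformation respects the interval scale but not the ratio scale.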

Returning to the problem of the fruit, one can see that if the 
preferences are defined on the interval scale instead of the ordinal 
scale, then the circularity in preference does not proscribe a 
solution. For instance, suppose that, in addition to P's ordinal
preferences (apple > orange > pear > banana > apple), P's interval
preference for a banana over an apple is greater than his interval
preference for an orange over a pear. In the notation of that
problem, U(y)-U(v) > U(w)-U(x). Then, if offered a choice between an
apple and an orange or a pear and a banana, it follows immediately
that U(x) + U(y) > U(v) + U(w); so P prefers a pear and a banana to
an apple and an orange.

Note that recourse to the ratio scale is not necessary for the 
solution of the above problem. Indeed, according to Rapoport, the 
measure of utilities on the interval scale is sufficient for solving 
game-theoretic problems (where solutions exist). In practice, how- 
ever, it might be difficult to establish such a measure. While most 
people can articulate preferences on the ordinal scale, it is not an 
accustomed practice to do so on the interval scale. 

Note also that the second problem, which conflates logical with 
psychological properties, is insensible to a change of scale. Even if 
the ratio of the initial preferences were specifiable, the inbuilt 
inconsistency in relation (1) would remain. The probabilistically- 
averaged ratio of 2U(150) : U(100) + U(200) is 1:1, while the psycho- 
logically-preferred ratio is x:1, where x > 1. The subsequent mathe- 
matical manipulations would yield the reciprocal ratio, 1:x, and the 


9 Neumann & Morgenstern call this relation "the isomorphism of
strategic equivalence"; 1955, p.504.


10 Rapoport, 1966, p.28.

contradiction would remain.

Thus far, one has vindicated Luce and Raiffa's criticism of the 
first of two assumptions leading to Neumann and Morgenstern's utility 
theory, by having illustrated shortcomings of the theory with respect 
to its two necessary properties (transitivity and linear transfor- 
mability). These illustrations have taken place in the mode of intra- 
personal comparison of utilities. One must also consider the mode of 
inter-personal comparison of utilities, which is not less beset by 
difficulties. 

The question of the utility of money is a classic problem, which
manifests itself in both modes. Intra-personally, it has been gener- 
ally assumed that the utility of money is a decreasing, non-linear 
function of the amount. When Daniel Bernoulli pondered the question 
in 1738, he concluded that 


"utility resulting from any small increase in wealth will 
be inversely proportional to the quantity of goods 
previously possessed." 


Empirical justifications for this assumption abound. For example: 


"H. Markowitz asked a group of middle-class people 
whether they would prefer to have a smaller amount of 
money with certainty or an even chance of getting ten 
times that much. The answers he received depended on the 
amount of money involved. When only a dollar was offered, 
all of them gambled for ten, but most settled for a
thousand dollars rather than try for ten thousand, and 
all opted for a sure million dollars.” 


While the assumption has been generalized in economics as "the
law of diminishing marginal utility", the actual function to be
employed remains quite arbitrary. Bernoulli supposed that the value
of money is proportional to its natural logarithm, and von Neumann
and Morgenstern partially endorsed his supposition.14


11 D. Bernoulli, "Exposition of a New Theory on the Measurement
of Risk", trans. L. Sommer, Econometrica, 22, 1954, pp.23-37.


12 Cited by Davis, 1970, p.51.


13 E.g. see L. Savage, The Foundation of Statistics, John Wiley
& Sons Inc., New York, 1954, p.94.


14 Neumann & Morgenstern, 1955, p.629.

An example by Rapoport (which he left unsolved) can be used to
illustrate the practical difference between treating the utility of 
money as a linear versus a non-linear function. A man with £1000 to 
gamble is offered odds of 100 to 1 on roulette. What amount should he 
wager in order to maximize the utility of the gamble? 

Answers to this question hinge on the utility of money. If the 
man wishes to maximize his utility, then he could adopt the classical 
but questionable "principle of mathematical expectation" (that the
gamble with the highest expected winnings is best),16 and wager the
entire £1000. Then he would have a 36/37 chance of losing £1000, and 
a 1/37 chance of winning £100,000. 

To maximize his utility in this linear case, he must solve the 
equation 


(36/37) (1000-x) + (1/37) (1000 + 100x) = 1000 


where x is the amount to be wagered. The solution is x = 0. So if the 
utility of money is linear, the man's optimal wager is no wager at 
all. He stands to gain nothing, and to forfeit nothing. 

In the non-linear case, if the man adopts Bernoulli's sugges- 
tion, then his optimal wager is found by solving the equation 


(36/37) ln(1000-x) + (1/37) ln(1000 + 100x) = ln(1000)


where x is the amount to be wagered and ln is the natural logarithm.
The approximate solution is x = 47.37. The man then stands to gain
£4,737, and to lose £47.37.

Thus, depending upon which utility rule he follows, the gambler 
may wager all, none, or part of his money. Rules can be devised that 
prescribe the wager of any fraction thereof. And, given that the 
gambler exercises some degree of freedom of choice, he may adopt any 
of the above rules, or invent his own. One cannot presume to say 
which monetary utility function seems to be the "best". 
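The two equations above can be checked numerically. The sketch below is illustrative only: the names STAKE, P_WIN, ODDS, expected_utility_gain, bisect_root, log_break_even and log_maximizer are assumptions of this example, not Rapoport's notation. It finds the positive root of the logarithmic equation by bisection. As written, each equation sets the expected utility after the wager equal to the utility of the unwagered £1000, so its roots are wagers that leave expected utility unchanged. For comparison, the sketch also computes the stake at which expected logarithmic utility is greatest, obtained by setting the derivative to zero (a step not taken in the text).

```python
import math

STAKE = 1000.0        # the gambler's £1000
P_WIN = 1.0 / 37.0    # chance of winning on a single number
ODDS = 100.0          # payout of 100 to 1


def expected_utility_gain(x, utility):
    """Expected utility after wagering x, minus the utility of not wagering."""
    win = utility(STAKE + ODDS * x)
    lose = utility(STAKE - x)
    return P_WIN * win + (1.0 - P_WIN) * lose - utility(STAKE)


def bisect_root(f, lo, hi, tol=1e-9):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def log_break_even():
    """Positive root of the logarithmic equation in the text."""
    return bisect_root(lambda x: expected_utility_gain(x, math.log), 20.0, 999.0)


def log_maximizer():
    """Stake maximizing expected log-utility, from the first-order condition."""
    return STAKE * (P_WIN * ODDS - (1.0 - P_WIN)) / ODDS


if __name__ == "__main__":
    print(expected_utility_gain(0.0, lambda w: w))   # linear case, x = 0
    print(log_break_even(), log_maximizer())
```

Under these assumptions log_break_even() returns a value close to the 47.37 quoted above, while log_maximizer() returns roughly 17.30; which of the two is the more appropriate reading of "optimal" depends on the rule the gambler adopts.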


15 Rapoport, 1960, pp.119-120.


16 E.g. see Savage, 1954, p.91.

The problem becomes even more open-ended in the mode of inter-
personal comparison of utilities. In this mode, one seeks a function 
that stipulates the different values, to different players, of a 
given payoff. 

For example, suppose that two players, A and B, are competing
for an apple. For the inter-personal comparison of utilities, the 
weakest possible scale (the ordinal) demands that the utility func- 
tion stipulate whether the apple's value to A is greater than, less 
than, or the same as its value to B. One might assess the case in 
point, and attempt to make an evaluation. If A owns an apple orchard 
while B does not, one might argue that a single apple holds greater
value for B. Then again, B may have recently consumed several apples, 
while A may be extremely hungry, in which case the apple holds 
greater value for A, his orchard notwithstanding. Or, if both A and B 
are severely allergic to apples, then the apple holds equally nega- 
tive value to both, unless one of them keeps a horse. Any number of 
heuristic arguments, and counter-arguments, can be made; but such 
argumentation is a far cry from the articulation of a well-defined 
mathematical function. 

Now, to complicate matters: suppose that both A and B prefer 
apples to oranges. The game-theoretic interval scale demands that the 
utility function stipulate whether A's preference is greater than, 
less than, or the same as B's preference. As Rapoport points out,


". . . the interval scale does not permit interpersonal
comparison of utilities, because both the zero point and
the unit of this scale remain arbitrary."


Again, in the case of money, one may attempt to fix this scale
according to some rule. Suppose that players A and B are competing
for £100. Explicitly, if one asks whether this prize has greater
value for A than for B, then the respective utilities of £100 seem to
require measurement on the ordinal scale. But if one is implicitly
asking whether this prize represents a greater increase in wealth
for A than for B, the respective utilities require measurement on the
stronger interval scale. Now suppose that A is wealthy, while B is


17 A. Rapoport, "Interpersonal Comparison of Utilities", Lecture
Notes in Economics and Mathematical Systems, 123, 1975a, pp.17-43.

impecunious. Then, arguably, £100 has greater value for B. In gener-
al, a fixed sum of money may have different utilities for different
players.18


Then one is confronted by a previous, unresolved problem; 
namely, the necessity of first fixing the intra-personal interval
scale, in order to permit inter-personal comparisons. 

Owing to this kind of insuperable difficulty, utility theory draws
moderate to severe criticism:


"The problem of trying to conceptualize and apply inter-
personal comparisons of utility is still unsolved."

". . . interpersonal comparison of utility has no mean-
ing."


There is a way in which these problematic issues can be side-
stepped, and the preferential aspect of utility theory salvaged, at
least for game-theoretic purposes. It consists in measuring payoffs
in units of pure utility, or utiles. Unlike the utility, the utile is
assumed to conserve the player's preferences. Unlike utilities,
utiles can be compared, both intra-personally and inter-personally.
The game-theoretic distinction between utility and utile is somewhat 
analogous to the physical distinction between weight and mass. The 
first is a relative measure; the second, absolute (in a Newtonian
sense, at non-relativistic velocities). For the theory of games, the 
adoption of a pure utile measure restricts the range of empirical 
application, but greatly enhances the domain of theoretical develop- 
ment. And the theory, if sufficiently developed in terms of players' 
preferences, may yet discover ways of applying itself anew.

Now recall the second assumption of Neumann's and Morgenstern's 
utility theory, as summarized by Luce and Raiffa: 


(2) "That there are certain well-defined chance events 
having probabilities attached to them which are manipu- 
lated according to the rules of probability calculus." 


18 This is also the view of Luce & Raiffa, 1957, p.43. 


19 R. Singleton & W. Tyndall, Games and Programs, W.H. Freeman &
Co., San Francisco, 1974, p.39.


20 Arrow, 1970, p.9.


21 Luce & Raiffa, 1957, pp.371-3.

This assumption is not less subject to criticism than the first. One
centre of contention lies in the plausibly-worded phrase "according
to the rules of probability calculus", in which the definite article
("the") implies the existence of a singular or universally-accepted
set of rules for the calculation of probabilities. This implication
fails to acknowledge, or bridge, the enormous rift between two most
general schools of probabilistic thought: the a priori (which in-
cludes classical and Bayesian systems), and the a posteriori (or
frequentist interpretation),22 each of which assesses probabilities
according to a different set of rules.

While it lies beyond the scope of this enquiry to embody a 
disquisition on the philosophy of probability theory, it is minimally 
necessary to differentiate, in passing, between the two general 
schools. In so far as this enquiry has recourse to both probability 
paradigms, as occasion warrants, it seems prudent to draw a
fundamental, if limited, distinction between them.

First, it must be said that the distinction itself is one of 
recognition rather than definition. One can equally well recognize 
four schools of probabilistic thought, or more.23 And one can draw
ever-finer distinctions between proponents of similar schools. But 
the two suffice for this purpose. 

The distinction can be drawn quite readily. Suppose two players 
wish to shoot craps in a casino. The rules are as follows: the 
shooter wagers an amount of money, then rolls a pair of dice. If he 
obtains seven or eleven on that roll, he wins; if two, three or 
twelve, he loses. If he obtains any other number, then he must roll 
the dice repeatedly until: either that number appears again, in which 
case he wins, or seven appears, in which case he loses. Suppose that 


22 This distinction is drawn e.g. by T. Seidenfeld, Philosophi-
cal Problems of Statistical Inference, D. Reidel Publishing Company,
Dordrecht, 1979. He classifies Laplace, De Morgan, Pearson, Keynes,
Jeffreys, Carnap, Finetti, and Savage as Bayesians; Boole, Venn,
Fisher, Neyman, von Mises, Reichenbach, Wald, Hacking, and Kyburg as
frequentists: p.1.


23 E.g. see R. Weatherford, Philosophical Foundations of Probab-
ility Theory, Routledge & Kegan Paul, London, 1982, pp.6ff. Weather-
ford recognizes four types of theory: classical, a priori, relative
frequency, and subjectivist.

one of the players is an a priori probabilist; the other, an a
posteriori probabilist.

The a priori probabilist assumes that the dice are fair, and
calculates the likelihood of different game-states occurring accord-
ing to the rules of classical probability theory (by finding the
ratio of equipossible cases to all possible cases, for each state).24
He finds that a pair of dice rolled simultaneously can produce
thirty-six possible game-states: rolls of two or twelve can occur in
only one way each; three or eleven, in two ways each; four or ten, in
three ways each; five or nine, in four ways each; six or eight, in
five ways each; while seven can occur in six different ways. He then
finds the associated probabilities of obtaining each number on a
given roll: two or twelve, 1/36; three or eleven, 2/36; four or ten,
3/36; five or nine, 4/36; six or eight, 5/36; seven, 6/36. He then
finds that his chances of winning on the first roll are 8/36; of
losing, 4/36; of having to roll again, 24/36. But if he has to roll
again, his chances of winning on any subsequent roll will vary from
3/36 to 5/36 (according to the number he must match), while his
chances of losing will remain constant at 6/36.
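The a priori calculation can be reproduced by enumerating the thirty-six equipossible cases. The following is a minimal sketch under the rules stated above; the function names two_dice_distribution, first_roll_chances and shooter_win_probability are assumptions of this example, and the shooter's overall chance of winning is an extension of the text's calculation rather than a figure it quotes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution():
    """Classical probability of each total for two fair six-sided dice."""
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    return {total: Fraction(n, 36) for total, n in counts.items()}


def first_roll_chances(dist):
    """Chances of winning, losing, and having to roll again on the first roll."""
    win = dist[7] + dist[11]
    lose = dist[2] + dist[3] + dist[12]
    return win, lose, 1 - win - lose


def shooter_win_probability(dist):
    """Overall chance that the shooter wins under the rules described above."""
    win, _, _ = first_roll_chances(dist)
    for point in (4, 5, 6, 8, 9, 10):
        # Once a number is set, only that number or a seven ends the game.
        win += dist[point] * dist[point] / (dist[point] + dist[7])
    return win


if __name__ == "__main__":
    dist = two_dice_distribution()
    print(first_roll_chances(dist))
    print(shooter_win_probability(dist))
```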

The a posteriori probabilist, however, makes no assumption 
whatsoever about the "fairness" of the dice. For him, the concept of 
equipossibility has no meaning.25 The a posteriori probabilist makes
a long series of observations of the game, recording the outcome of 
each roll of the dice. After a sufficiently large number of rolls 
have been observed, he calculates the relative frequency with which 
each outcome has occurred. If the dice are "fair", then as the number 
of observations increases, the frequency distribution will tend 


24 A rigorous justification for this method was developed by
James Bernoulli in his Ars Conjectandi; see e.g. I. Todhunter, A
History of the Mathematical Theory of Probability, Macmillan & Co.,
London, 1865, pp.70-73.


25 The frequentist position was developed in order to avoid
circular definitions and other inherent problems of classical theory.
See R. von Mises, Probability, Statistics and Truth, Dover Publica-
tions Inc., New York, 1981 (translation of revised edition of 1951).

toward the classical probability values.26 If the dice are not fair
(i.e. are "loaded"), then the weight of the loading will be reflected
in the given frequency distribution.
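The a posteriori procedure can be imitated by simulation. The sketch below is an illustration under stated assumptions: a single die is rolled rather than a pair, the loading (face six three times as likely as any other face) is hypothetical, and the name observed_frequencies is an invention of this example. The observer sees only the rolls, never the weights.

```python
import random
from collections import Counter

FACES = [1, 2, 3, 4, 5, 6]


def observed_frequencies(weights, n_rolls, seed=0):
    """Relative frequency of each face over n_rolls of a (possibly loaded) die.

    `weights` plays the part of the die's physical bias; it drives the
    simulated rolls but is not consulted when the frequencies are computed.
    """
    rng = random.Random(seed)
    rolls = rng.choices(FACES, weights=weights, k=n_rolls)
    counts = Counter(rolls)
    return {face: counts[face] / n_rolls for face in FACES}


if __name__ == "__main__":
    print(observed_frequencies([1, 1, 1, 1, 1, 1], 200_000))   # fair die
    print(observed_frequencies([1, 1, 1, 1, 1, 3], 200_000))   # hypothetical loading
```

With a fair die the recorded frequencies settle near 1/6 each; with the hypothetical loading, the frequency of six settles near 3/8, and the observer detects the bias without any appeal to equipossibility.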

Thus, in an honestly-run game, both probabilists will agree on 
their chances of winning and losing. But in a dishonestly-run game, 
the a priori probabilist stands to be cheated, while the a posteriori 
probabilist will have fuller knowledge of the true odds. It is of 
interest that the a posteriori probabilist need know nothing of 
classical probability theory to make his assessment. The dice yield 
an empirical result; if loaded, they will not be presumed to have 
deviated from an a priori expectation. Thus the a posteriori probabi- 
list not only cannot be cheated; he also avoids making moral presump- 
tions upon the honesty, or dishonesty, of this type of game. 

One more example is instructive. Suppose the probabilists are 
about to play the children's game of Rock, Scissors, Paper. In this 
game, two players each place one hand behind their backs, then 
simultaneously present their hands in one of three configurations: a 
fist (signifying rock), a Churchillian "V" (signifying scissors), or 
a palm (signifying paper). Rock defeats scissors (by virtue of 
smashing); scissors defeat paper (by virtue of cutting); paper 
defeats rock (by virtue of enveloping). The matrix is as follows: 


Game 2.3 - Rock, Scissors, Paper


                   B
             R      S      P
       R    0,0    1,-1   -1,1        R means Rock
   A   S   -1,1    0,0    1,-1        S means Scissors
       P    1,-1  -1,1    0,0         P means Paper


This is a two-person, zero-sum, non-co-operative game that is
not strictly determined. The matrix of Game 2.3 not only has no
saddle point: it is completely symmetric with respect to payoffs.


26 This consequence is explicit in James Bernoulli's Law of
Large Numbers. Laplace also developed a method for ascertaining how
many trials are necessary to obtain a given result that lies within
pre-assigned limits. See Todhunter, 1865, pp.548-54.

From either player's point of view, each row (and column) contains
exactly one winning outcome, one losing outcome, and one drawn 
outcome. Furthermore, all non-zero payoffs are identical in absolute 
magnitude. Thus neither player can express a logical preference for
any row or column. 

Suppose that player A is an a priori probabilist, and that
nothing whatever is known about player B. Player A must resort to the
"principle of insufficient reason"; namely, that


"alternatives are always to be judged equiprobable if we 
have no reason to expect or prefer one over another." 


So A assumes that player B will choose R, S, or P with probabilities 
of 1/3 each. A's expected utility of choosing R is then 


EU(R) = (1/3)U(R,R) + (1/3)U(R,S) + (1/3)U(R,P)
      = (1/3)(0) + (1/3)(1) + (1/3)(-1)
      = 0


Similarly, A's expected utilities of choosing S, and P, are identi- 
cally zero. 
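The three expected utilities can be recomputed from the payoffs of Game 2.3. The sketch is illustrative: the names MOVES, BEATS, payoff, expected_utility and UNIFORM are assumptions of this example, and the payoffs are those to player A only.

```python
from fractions import Fraction

MOVES = ("R", "S", "P")
BEATS = {"R": "S", "S": "P", "P": "R"}   # each key defeats its value


def payoff(a_move, b_move):
    """Utility to player A: +1 for a win, -1 for a loss, 0 for a draw."""
    if BEATS[a_move] == b_move:
        return 1
    if BEATS[b_move] == a_move:
        return -1
    return 0


def expected_utility(a_move, b_probs):
    """A's expected utility for one choice, given B's choice probabilities."""
    return sum(b_probs[b] * payoff(a_move, b) for b in MOVES)


# The principle of insufficient reason: B is judged equiprobable over R, S, P.
UNIFORM = {move: Fraction(1, 3) for move in MOVES}

if __name__ == "__main__":
    for move in MOVES:
        print(move, expected_utility(move, UNIFORM))
```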

In such a case, von Neumann and Morgenstern also recommend that
A play equiprobably: 


"Thus one important consideration for a player in such a
game is to protect himself against having his intentions
found out by his opponent. Playing several different
strategies at random, so that only their probabilities are
determined, is a very effective way to achieve a degree
of such protection: by this device the opponent cannot
possibly find out what the player's strategy is going to
be, since the player does not know it himself."


This is a compelling argument, which holds as long as the opponent is 
also playing with uniform randomness. Indeed, if both A and B proceed
to play as such, then over the course of many plays, they will each 
tend to win one third of the games, lose one third of the games, and 
draw one third of the games. 

And in this case, the a posteriori probabilist, observing that
the relative frequencies with which B chooses R, S, and P are
approximately 1/3 each, adopts the same strategy of random play, and
fares the same as the a priori probabilist. As long as B plays with
uniform randomness, both probabilists achieve the same result.


27 Weatherford, 1982, p.29; see also Luce & Raiffa, 1957, p.284.


28 Neumann & Morgenstern, 1955, p.146.

But suppose that B plays with non-uniform randomness; i.e. that 
the a priori probabilities of his choices are weighted. Let the 
weights be such that B chooses R with probability 1/2, and S and P
with probabilities 1/4 each. 

(One can posit any number of plausible reasons for the uneven 
weightings. For example, suppose that B intends to play with uniform 
randomness by rolling a die, and that he will choose R if he rolls 
one or two; S if he rolls three or four; and P if he rolls five or 
six. But unknown to B, he uses a die that is "loaded" to yield one 
and two with probabilities 1/4 each; and to yield three, four, five 
and six with probabilities 1/8 each. Then the above distribution 
would obtain.) 

If B plays according to these weights, and A plays with a
priori, uniform randomness, then the matrix of probabilities for each
outcome is as follows:


Game 2.4 - Weighted Probability Matrix for Rock, Scissors, Paper
Player A: p(R) = p(S) = p(P) = 1/3
Player B: p(R) = 1/2; p(S) = p(P) = 1/4

                     B
             p(R)    p(S)    p(P)
      p(R)   1/6     1/12    1/12
  A   p(S)   1/6     1/12    1/12
      p(P)   1/6     1/12    1/12


Unknown to player A, who is not recording the relative frequen-
cies of B's choices, his expected utilities for Game 2.4 are now

EU(R) = p(R,R)U(R,R) + p(R,S)U(R,S) + p(R,P)U(R,P)
      = (1/6)(0) + (1/12)(1) + (1/12)(-1)
      = 0

EU(S) = p(S,R)U(S,R) + p(S,S)U(S,S) + p(S,P)U(S,P)
      = (1/6)(-1) + (1/12)(0) + (1/12)(1)
      = -1/12

EU(P) = p(P,R)U(P,R) + p(P,S)U(P,S) + p(P,P)U(P,P)
      = (1/6)(1) + (1/12)(-1) + (1/12)(0)
      = 1/12

Although A's set of expected utilities in Game 2.4 differs from 
that of Game 2.3, A's average result is identical in both cases. 
After a large number of plays of Game 2.4, he will have won one third 
of the games, lost one third of the games, and drawn one third of the 
games, for a net average gain of zero utiles. 


Now suppose that the a posteriori probabilist takes his turn as 
player A. He has been observing player B, and has recorded the 
relative frequencies of B's choices. The a posteriori probabilist 
counters B's weighted play with a weighting of his own. As player A,
he makes random choices with weighted probabilities one-quarter for 
Rock and Scissors, and one-half for Paper. The new probability matrix 
is as follows: 


Game 2.5 - Re-Weighted Probability Matrix for Rock, Scissors, Paper
Player A: p(R) = p(S) = 1/4; p(P) = 1/2
Player B: p(R) = 1/2; p(S) = p(P) = 1/4

                     B
             p(R)    p(S)    p(P)
      p(R)   1/8     1/16    1/16
  A   p(S)   1/8     1/16    1/16
      p(P)   1/4     1/8     1/8


For Game 2.5, player A's expected utilities are 
EU(R) = p(R,R)U(R,R) + p(R,S)U(R,S) + p(R,P)U(R,P)
      = (1/8)(0) + (1/16)(1) + (1/16)(-1)
      = 0

EU(S) = p(S,R)U(S,R) + p(S,S)U(S,S) + p(S,P)U(S,P)
      = (1/8)(-1) + (1/16)(0) + (1/16)(1)
      = -1/16

EU(P) = p(P,R)U(P,R) + p(P,S)U(P,S) + p(P,P)U(P,P)
      = (1/4)(1) + (1/8)(-1) + (1/8)(0)
      = 1/8

This result certainly favours player A. On average, A wins
three-eighths of the games, loses five-sixteenths of the games, and 
draws five-sixteenths of the games. The average quantity won exceeds 
the average quantity lost by one-sixteenth of a utile. After a large 
number of plays of Game 2.5, A's net average gain will be one utile 
for every sixteen plays. 
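The figures for Games 2.4 and 2.5 can be recomputed from the two players' weightings. The sketch follows the text's convention, in which each EU is the sum along one row of joint probability times payoff; the names joint, weighted_utilities, outcome_shares, A_UNIFORM, A_REWEIGHTED and B_WEIGHTED are assumptions of this example.

```python
from fractions import Fraction as F

MOVES = ("R", "S", "P")
BEATS = {"R": "S", "S": "P", "P": "R"}   # each key defeats its value


def payoff(a_move, b_move):
    """Utility to player A: +1 for a win, -1 for a loss, 0 for a draw."""
    if BEATS[a_move] == b_move:
        return 1
    if BEATS[b_move] == a_move:
        return -1
    return 0


def joint(a_probs, b_probs):
    """Probability of each outcome when A and B choose independently."""
    return {(a, b): a_probs[a] * b_probs[b] for a in MOVES for b in MOVES}


def weighted_utilities(a_probs, b_probs):
    """Row sums of p(a, b) * U(a, b): the text's EU(R), EU(S), EU(P)."""
    cells = joint(a_probs, b_probs)
    return {a: sum(cells[a, b] * payoff(a, b) for b in MOVES) for a in MOVES}


def outcome_shares(a_probs, b_probs):
    """Long-run shares of games that A wins, loses, and draws."""
    cells = joint(a_probs, b_probs)
    win = sum(p for (a, b), p in cells.items() if payoff(a, b) == 1)
    lose = sum(p for (a, b), p in cells.items() if payoff(a, b) == -1)
    return win, lose, 1 - win - lose


B_WEIGHTED = {"R": F(1, 2), "S": F(1, 4), "P": F(1, 4)}
A_UNIFORM = {m: F(1, 3) for m in MOVES}                     # Game 2.4
A_REWEIGHTED = {"R": F(1, 4), "S": F(1, 4), "P": F(1, 2)}   # Game 2.5

if __name__ == "__main__":
    for a_probs in (A_UNIFORM, A_REWEIGHTED):
        print(weighted_utilities(a_probs, B_WEIGHTED), outcome_shares(a_probs, B_WEIGHTED))
```

The net average gain per play is the sum of the three weighted utilities, or equivalently the share of games won minus the share lost.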

Again, the a posteriori probabilist need not impute any mo- 
tives, whether logical or psychological, to account for and to 
counter player B's weighted choices. Just as in the example of the 
dishonestly-run casino, A's observation of the relative frequency of 
events is a value-neutral process. 

The purpose of these two examples is most assuredly not to make 
a case for the relative superiority of one school of probabilistic 
thought over another; rather, it is to argue that the outcomes of 
certain games can be affected by a particular choice of probabilistic 
paradigm on the part of the player. 

For games involving random (or pseudo-random) moves, a player 
must assign some probabilistic distribution to outcomes in order to 
calculate the expected utilities of different choices. It has been 
illustrated that, in some cases, the results of a priori and a 
posteriori probability assignments are convergent. When the two 
methods do not converge, it has been shown that the player who 
employs an a posteriori calculus may forfeit less, or gain more, than 
one who employs an a priori calculus. 

The objection can be made that no example was given which 
explicitly favours the a priori over the a posteriori method. The 
latter method is not without potential shortcomings, one of which can 
be illustrated in the following way. 

Suppose two players are "matching pennies". Player A first
predicts either "even parity" (two heads or two tails) or "odd 
parity" (one head and one tail). Next, they each flip a penny and 
allow the coins to fall. If player A predicted the outcome correctly, 
he wins; if not, he loses. The matrix is as follows: 

Game 2.6 - Matching Pennies 


                     B 
               E_o        O_o 
       E_p     1,-1      -1,1 
  A 
       O_p    -1,1        1,-1 

E means "even parity"; O means "odd parity" 
p-subscript means "predicted"; o-subscript means "occurred" 


Suppose that player A is an a priori probabilist, and suppose 
also that both coins are fair. Then the probability of each outcome 
is 1/4. Player A constructs a probability matrix: 


Game 2.7 - Probability Matrix for Matching Pennies 
Player A: p(H) = 1/2, p(T) = 1/2 
Player B: p(H) = 1/2, p(T) = 1/2 


                   B 
             p(H)     p(T) 
      p(H)   1/4      1/4 
  A 
      p(T)   1/4      1/4 

p(H) means "probability of heads" 
p(T) means "probability of tails" 


Since A predicts even and odd parity with random probabilities 
of one-half each, his prediction percentage is approximately fifty 
percent correct over a large number of games. On net average, he 
neither wins nor loses. 

Now suppose that player A is an a posteriori probabilist. 
Suppose also that both coins are fair, but that A does not make a 
sufficiently large number of observations. Let him make ten observa- 
tions, in which he finds his coin to have landed "heads" four times, 
and "tails" six times; and in which he finds B's coin to have landed 
"heads" seven times and "tails" three times. Now let A conclude from 
these observations that his coin is weighted 6:4 in favour of tails, 
while B's coin is weighted 7:3 in favour of heads. A then constructs 
a (fallacious) relative frequency matrix, based upon his limited 
observations: 

Game 2.8 - Fallacious Relative Frequency Matrix 
Player A: f(H) = 4/10, f(T) = 6/10 
Player B: f(H) = 7/10, f(T) = 3/10 


                   B 
             f(H)      f(T) 
      f(H)   28/100    12/100 
  A 
      f(T)   42/100    18/100 

f(H) means "frequency of heads" 
f(T) means "frequency of tails" 


From this matrix, A finds that the combined relative frequency 
of even parity is (28+18)/100 = 46/100, while that of odd parity is 
(42+12)/100 = 54/100. Thus A concludes that he should play randomly 
but not uniformly, weighting his predictions to favour even parity in 
forty-six of each one hundred subsequent games, and odd parity in 
fifty-four. 
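
The parity frequencies of Games 2.7 and 2.8 follow mechanically from the individual coin frequencies. The sketch below assumes the two coins fall independently, as the matrices themselves do; the function names are illustrative, not part of the original analysis.

```python
from fractions import Fraction


def joint_matrix(a_heads, b_heads):
    """Joint weights over (A's coin, B's coin), assuming independent tosses."""
    a = {"H": a_heads, "T": 1 - a_heads}
    b = {"H": b_heads, "T": 1 - b_heads}
    return {(x, y): a[x] * b[y] for x in "HT" for y in "HT"}


def parity(matrix):
    """Return (even, odd): even parity is HH or TT, odd parity is HT or TH."""
    even = matrix[("H", "H")] + matrix[("T", "T")]
    odd = matrix[("H", "T")] + matrix[("T", "H")]
    return even, odd


# A priori assignment (Game 2.7): both coins taken to be fair.
a_priori = parity(joint_matrix(Fraction(1, 2), Fraction(1, 2)))

# A posteriori assignment from ten tosses (Game 2.8): A 4/10 heads, B 7/10 heads.
observed = parity(joint_matrix(Fraction(4, 10), Fraction(7, 10)))

print(a_priori)
print(observed)
```

The second result reproduces the 46/100 and 54/100 weights that A derives from his limited observations.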

Consequently, after a large number of subsequent games, A's 
prediction percentage is only about forty-two percent correct. On net 
average, he loses eight utiles per hundred games. 

When interpreting a posteriori probabilities, then, it is of 
paramount importance to ensure that an observed relative frequency 
attains a limiting value.³ This A failed to do, by observing an 
insufficient number of events. 

Furthermore, it is not always feasible to employ the a posteriori 
method. In countless situations where one must take a decision 
under risk or conflict of interest, without the benefit of a 
sufficiently lengthy series of observations of outcomes in similar 
situations, the a posteriori calculus is inapplicable. Then one has 
recourse to classical, or to Bayesian, or to subjective 
interpretations. Not every calculus is available in every 
game-theoretic situation. 


3 If it is to meet the von Mises criterion of randomness, this 
value must be independent of any "selection rule" for the observed 
events. For instance, a "fair" coin will land "heads" in about fifty 
percent of trials, given a large enough number of trials, and this 
limiting value should be obtainable from any large sub-sequence of 
the observed trials, according to any rule of place-selection. For 
instance, the relative frequency of "heads" in even-numbered trials, 
odd-numbered trials, prime-numbered trials, i trials, etc., should 
have a limiting value of fifty percent, within confidence limits 
defined by appropriate statistical tests. See Mises, 1981, pp.87-9. 
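
The place-selection requirement attached to the von Mises criterion of randomness can be illustrated empirically. The following sketch is offered under stated assumptions: Python's pseudo-random generator stands in for a fair coin (it is not a collective in the strict von Mises sense), and the seed, the number of trials, and the selection rules are arbitrary choices.

```python
import random


def is_prime(n):
    """Trial-division primality test; adequate for a small illustration."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def heads_frequency(tosses, select):
    """Relative frequency of heads among trials (numbered from 1) chosen by `select`."""
    picked = [t for k, t in enumerate(tosses, start=1) if select(k)]
    return sum(picked) / len(picked)


rng = random.Random(0)  # fixed seed, chosen arbitrarily
tosses = [rng.random() < 0.5 for _ in range(100_000)]  # True means heads

# Each rule selects trials by position only, never by outcome.
rules = {
    "all trials": lambda k: True,
    "even-numbered": lambda k: k % 2 == 0,
    "odd-numbered": lambda k: k % 2 == 1,
    "prime-numbered": is_prime,
}
frequencies = {name: heads_frequency(tosses, rule) for name, rule in rules.items()}
for name, freq in frequencies.items():
    print(f"{name}: {freq:.3f}")
```

With a sufficiently long series, each selected sub-sequence should show a relative frequency close to one-half; with only ten tosses, as in player A's case, no such stability can be expected.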

While the introduction of a pure utility measure allows one to 
evade unsolved problems of comparing utilities (at the cost of 
restricting the applicability of the theory), classes of games exist 
in which one cannot avoid the employment of a probability calculus. 
The difficulty then lies in selecting an appropriate calculus for the 
given situation, and is compounded by the fact that each school of 
probabilistic thought admits of particular strengths and weaknesses. 

It can be seen that the von Neumann-Morgenstern utility func- 
tion is formed by the concatenation of two problematic calculi: one 
of preferences, the other of probabilities. That both are subject to 
criticisms seems clear enough. The severest criticism, though, is not 
necessarily the most instructive. One has it from Savage that 


"The postulates leading to the von Neumann-Morgenstern 
concept of utility are arbitrary and gratuitous." 


That they are rich in controversy is apparently beyond dispute. 
Utility theory, however incompletely formulated, remains indispen- 
sable to game theory. 

And one further concept, no less indispensable but perhaps more 
controversial, requires elaboration in this game-theoretic back- 
ground; namely, the concept of rationality. 


4 Savage, 1954, p.99. 
Chapter Three 
Game-Theoretic Rationality 


Before delving into the intricacies of the Prisoner's Dilemma, 
it is necessary to review the concept of rationality in a game- 
theoretic context. The concept is laden with difficulties, but must 
be addressed; for it is of central importance to both game theory in 
general and the Prisoner's Dilemma in particular. 

An assumption about rational choice was ineluctably smuggled 
into the synopsis, in Chapter One, of the property of strict deter- 
minateness. It was implicitly assumed that a so-called "rational" 
player would select that row (or column) of a game matrix which 
contains a saddle point, if indeed such a point exists in the given 
game. Recall that the grounds for this assumption were that if the 
so-called "rational" player chooses the minimax (or maximin, as the 
case may be), then he can fare no worse in that game, regardless of 
whether his opponent plays "rationally" or "irrationally". The 
example was given in order to illustrate the importance of the saddle 
point; its corollary implication, however, was that a "rational" 
player always plays minimax (or maximin, as the case may be), while 
an "irrational" player may not always do so. The soundness of this 
implication must now be called into question. 

Let one commence with the von Neumann-Morgenstern caveat to 
their qualification of rationality: 


"The individual who attempts to obtain these respective 
maxima [maximin and minimax] is also said to act 'ratio- 
nally'. But it may be safely stated that there exists, at 
present, no satisfactory treatment of the question of 
rational behaviour." 


Rapoport expresses the wish to modify the caveat itself, by denuding 
game theory of "psychological" overtones.² The term "behaviour" is 
connotative of psychology, and Rapoport argues that psychological 
orientations of "rational players" should be irrelevant in a formal 
game-theoretic context.³ 


1 Neumann & Morgenstern, 1955, p.9. See also Luce & Raiffa, 
1957, p.5. 


2 Rapoport, 1966, p.103. 

One reason for this viewpoint is as follows: in an idealized 
situation, just as one can compare utilities in units of pure utiles, 
so it would be convenient to define a "rational player" in a way that 
depends purely on his play. As the ideal unit of utility, the utile 
orders the values of preferences by mapping them to the real numbers. 
It transcends the intransitivity of circular preferences, and permits 
the interpersonal comparison of utilities. Similarly, the ideal 
definition of rationality would map each play to a Boolean statement, 
either "rational" or "irrational", in a way that transcends the 
psychological motives of the players. Can such a definition be 
articulated, even in the ideal case? 

In the game of poker, it is often useful to employ the tactics 
of "bluffing" and "sandbagging", which entail, respectively, the 
occasional over-playing of weak hands, and under-playing of strong 
hands, in order to mislead one's opponents. These tactics are work- 
able because poker is a game of imperfect information.⁴ As such, the 
outcome of a given hand does not necessarily depend on the cards that 
the players are actually holding, and frequently depends rather upon 
the fictitious cards that they believe one another to be holding. 

Suppose a player decides to bluff on a weak hand. He wagers 
increasingly large amounts of money on his cards, as though he held a 
strong hand. If his bluff is not "called", then the bluffer wins with 
a hand that would normally have lost; and the losing players, who did 
not pay to view his cards, might assume that he did indeed hold a 
winning hand. 

But if his bluff is "called", the bluffer must reveal his weak 
hand to the players who have matched his wager. They immediately 
realize that he was attempting to bluff. The bluffer thus loses the 
hand in question, but sets a potentially lucrative precedent in the 
process. For when he next holds a very strong hand, he may again 
wager large amounts of money (perhaps feigning nervousness as he does 
so), in order to induce the other players into believing that he is 
once again attempting to bluff. They may match his wager and call 
what they suppose to be his "bluff", only to find that he has not 
been bluffing on this occasion. 


3 id. 


4 Simply stated, a game of imperfect information is a game in 
which some moves are concealed. E.g. see Neumann & Morgenstern, 1955, 
pp.51-52 ff. 

Thus the astute poker player is willing to lose one or more 
hands quite deliberately, in order to potentiate a future situation 
in which he expects not only to recoup his previous losses, but also 
to realize a net gain. 

If one defines poker-theoretic rationality as the wish to 
maximize one's overall winnings (or minimize one's overall losses), 
then it is also poker-theoretically rational to employ the tactic of 
bluffing from time to time (although game theory can prescribe 
neither the frequency nor the cost of the tactic). Then, if a player 
loses a given hand because his bluff has been called, he is not 
irrational, but perhaps ambitious. Suppose another player loses 
several hands in this fashion, but the game ends before he can recoup 
his losses. That player is not irrational, but perhaps unfortunate. 
Suppose another player wins the game without ever having bluffed. 
That player is not irrational, but perhaps fortunate. And suppose 
another player loses all his money, without ever having bluffed. That 
player is not irrational, but perhaps unskilled. Suppose a player is 
winning by a substantial amount, but wagers this entire amount on the 
final hand, and loses. That player is not irrational, but perhaps 
avaricious. Thus no poker player is irrational, if he wishes to win 
in the long run. 

However, this definition of rationality is the antithesis of 
the sought-after "ideal" definition, because it depends not at all on 
the play and hinges solely upon the motives of the players. 

Yet it does not seem at all sensible to alter the working 
definition of rationality with respect to poker, by claiming that it 
is rational to seek to win as much as possible, or to lose as little 
as possible, on each individual hand. While this new working defini- 
tion would conform to the ideal, by assessing the play and discount- 
ing the motives of the players, it could prove paradoxical. Suppose 
the overall winner of a poker game turns out to be a player who 
bluffs quite frequently. The working definition labels him as 
"irrational", yet he fares better than the "rational" player. A defini- 
tion of rational play that both urges a player to win by rational 
means and acknowledges the potential superiority of irrational means, 
is self-contradictory and therefore unsatisfactory. 

At first blush, this state of affairs may seem to arise because 
poker is a game of imperfect information which is not strictly 
determined. (Recall that in a strictly determined game, a player who 
chooses the minimax fares even better if his opponent does not choose 
the maximin; and a player who chooses the maximin fares even better 
if his opponent does not choose the minimax.) In a game without a 
saddle point, the "rational" player has no inherent defense against 
an "irrational" player, if rationality means maximizing gains or 
minimizing losses on every play. 

However, it can be demonstrated that the paradox is a conse- 
quence, not of imperfect information and the absence of strict 
determination, but of the attempt to articulate an ideal definition 
of rationality. Consider chess, which is a strictly determined game 
of perfect information.⁵ A chess game is either won, lost, or drawn, 
according to the disposition of the pieces, which are always in plain 
view of the players. The tactic of bluffing would seem to have no 
relevance in this game. 

In world championship chess, a match is the best of twenty-four 
games (in each of which a player receives one point for a win, no 
points for a loss, and one-half point for a draw). Thus, the first 
player to attain twelve-and-one-half points is the victor.⁶ The 
working definition of rationality, which proved paradoxical in poker, 
prescribes that the rational player attempt to win as many chess 
games as possible, and lose as few as possible, in order to win the 
match. 

If that seems reasonable, then consider what actually took 
place in the 1972 world championship match in Reykjavik, between 
Bobby Fischer and Boris Spassky.⁷ Fischer, who had given his prior 
written assent to the presence of television cameras, found that 
their proximity interfered with his concentration. He therefore 
refused to play until the cameras were removed, and remained in his 
hotel room when the match officially commenced. Fischer's apparent 
"bluff" was called, and he proceeded to forfeit the first two games 
of the match. An accommodation was then reached, and Fischer played 
in subsequent games. Spassky held the initial lead of two games to 
none (a considerable advantage at this level of competition), but was 
unnerved by Fischer's cold-blooded forfeitures. Fischer eventually 
won the match with brilliant play, while Spassky made several 
blunders unworthy of a player of his stature. 


5 Simply stated, a game of perfect information is a game in 
which no moves are concealed from any player. 


6 If the score is tied at twelve points each, then the incumbent 
champion retains the title. 

According to the working definition of rationality, Fischer 
played irrationally in the first two games, by losing them delibe- 
rately. A "rational" player would have elected to play under condi- 
tions of slightly impaired concentration, because he could not have 
fared worse by playing, and might indeed have fared better. But in 
retrospect, Fischer's "irrational play" in the first two games was an 
ingredient of his eventual victory in the match. 

One seems obliged to concede that, whether the game is one of 
perfect or imperfect information, and whether strictly determined or 
not, a certain number of losses may conduce to an overall win in the 
long run. In that case, one cannot demand, by definition, that a 
"rational" player seek to maximize his wins, and minimize his losses, 
at every opportunity. But then one cannot define rationality in terms 
of the play itself, and one is thrown back upon the undesirable 
necessity of gauging rational, or irrational play, in terms of the 
motives of the player. 

At this juncture, one might argue that the problem stems not 
from the working definition of rationality as such, but from the 
failure to draw a categorical distinction between games and 
meta-games.⁸ 


7 E.g. see C. O'D. Alexander, Fischer v. Spassky: Reykjavik 
1972, Penguin Books Ltd., Harmondsworth, 1972. 

Poker is a meta-game, in that a "game" of poker consists of 
many hands. Each hand may be evaluated as a separate game, and the 
meta-game outcome is the algebraic sum of the outcomes of the hands. 
Similarly, a chess match is a meta-game consisting of many games of 
chess. The outcome of the match is the algebraic sum of the outcomes 
of the games. Given this distinction, is it possible to formulate an 
ideal definition of rationality which takes into account that delibe- 
rate losses of a game (or games) may still conduce to victory in the 
associated meta-game? 

The distinction between games and meta-games necessitates a 
similar distinction between move and strategy, in the sense that a 
losing move in a given game may form part of a winning strategy in 
the associated meta-game. In that case, rationality is embodied not 
in the move itself, but rather in the strategy that gives rise to the 
move. Thus, one can attempt the following reformulation: the ideal 
definition of rationality would map each strategy (instead of move) 
to a Boolean statement, either "rational" or "irrational", in a way 
that transcends the psychological motives of the players. The ques- 
tion is, can one infer the rationality (or irrationality) of a player 
merely by observing his strategy? If so, then "rationality" is 
ideally defined. 

Unfortunately, the answer to the question seems to be: not 
necessarily. Consider this example. Suppose a wealthy but eccentric 
sportsman sponsors a poker game according to the following rules: 
each player begins the game with £1000. There is a maximum bet of £5 
and one raise per hand. The first player to lose £1000 wins a prize 
of £10,000. Now suppose a game-theorist, who is unaware of the 
meta-game situation, observes the play of several hands. Based on the 
strategies he observes, he may speedily conclude that the players are 
irrational (if not utterly mad). But if the game-theorist were 
informed that the first player to lose £1000 in the poker game wins 
£10,000 in the meta-game, he could conclude from the same 
observations that the players are quite rational. Thus one cannot 
always infer a player's rationality (or irrationality) from his game 
strategy alone; one may also require knowledge of the rules of an 
associated meta-game, in order to draw such an inference. 


8 Formal Meta-game theory was developed by N. Howard, Paradoxes 
of Rationality: Theory of Metagames and Political Behaviour, The 
M.I.T. Press, Cambridge, Massachusetts, 1971. 

It seems that an ideal definition of rationality cannot always 
be based solely upon the game strategy of a player; it must also take 
into account the rules of the associated meta-game. And, to further 
complicate matters, while many players may be involved in the same 
game, with each player may be associated a different meta-game. The 
game-theorist cannot infer a player's rationality unless he knows the 
rules of the particular meta-game associated with that player. 

Consider, for instance, the hypothetical case of a wealthy 
poker player who loses money deliberately to his fellow poker-play- 
ers, as an act of charity. He may, inadvertently, win a few hands in 
the process; but his meta-game rule is to maximize his long-term 
losses. Suppose the other players are playing "normally"; that is, 
they share the meta-game rule of attempting to maximize their long- 
term winnings. If the game-theorist observer is unaware of the 
charitable player's meta-game rule, he might infer, based on his 
observations of strategy, that the player is irrational. But if made 
aware of the charitable player's rule, he would infer from his 
observations that the charitable player is indeed rational. 

Note that one does not need to know the actual motives of the 
player in order to draw such an inference. It is not necessary that 
the game-theorist be told that the player in question is motivated by 
charitability; he need only know whether the player's meta-game rule 
is the long-term maximization of winnings, or of losses. If that 
player's game-strategy is consistent with his meta-game rule, then 
that player may be called "rational"; if not, then "irrational." 

Note also that the other players in this hypothetical game, 
though they share an identical meta-game rule (the maximization of 
long-term winnings), may do so for completely different reasons. One 
player may wish to purchase a gift for his wife; another may wish to 
make a donation to medical research; a fourth may wish to pay for 
music lessons for his child. Again, the game-theorist does not need 
to be told what motivates these players; he need only know that their 
meta-game rule is the maximization of long-term winnings. If a 
player's game strategy is consistent with his rule, then that player 
may be called "rational"; if not, then "irrational". 

This dyadic definition of rationality, which assesses consis- 
tency between a game strategy and its associated meta-game rule, 
entails no moral judgement concerning intra-personal motives, nor 
does it attempt an inter-personal comparison of motives. It satisfies 
Rapoport's demand that the psychology of a player be excluded from 
consideration of his rationality. 

From the foregoing example, it is clear that one cannot infer a 
player's meta-game rule simply by observing his game-strategy. While 
the losing strategy of the charitable poker-player is consistent with 
his meta-game rule of maximizing long-term losses, an identical 
losing strategy could also be adopted by an irrational player whose 
meta-game rule is to maximize his long-term winnings. The observer of 
these players would err by inferring the identity of their meta-game 
rules from the identity of their game strategies. Of course, if the 
observer were told that one of the players is rational, and the other 
irrational, he could then infer that their meta-game rules are 
different. But he could not identify the rational (or the irrational) 
player without knowing which meta-game rule a particular player 
obeys. 

In order to ascertain whether a given player is rational or 
not, the observer can construct a meta-matrix for that player, as 
follows: 


Game 3.1 - Observer's Meta-Matrix for Player A 


                              A's Meta-Game Rule 
                          Maximize        Maximize 
                          Winnings        Losses 
              Winning     player A        player A 
A's Game      Strategy    is rational     is irrational 
Strategy 
              Losing      player A        player A 
              Strategy    is irrational   is rational 
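
The meta-matrix amounts to a consistency check between two labels. A minimal sketch, assuming that strategies and rules can each be reduced to one of two hypothetical string labels, might read:

```python
def is_rational(game_strategy, meta_game_rule):
    """A player counts as rational when his strategy is consistent with his rule."""
    consistent = {
        ("winning", "maximize winnings"),
        ("losing", "maximize losses"),
    }
    return (game_strategy, meta_game_rule) in consistent


# Reproduce the four cells of the observer's meta-matrix.
for strategy in ("winning", "losing"):
    for rule in ("maximize winnings", "maximize losses"):
        verdict = "rational" if is_rational(strategy, rule) else "irrational"
        print(f"{strategy} strategy, rule '{rule}': player A is {verdict}")
```

The function takes the meta-game rule as an input; it cannot be inferred from the strategy, which is precisely the observer's difficulty.
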
Thus far, it has been possible for the observer to discern 
between winning and losing game strategies, independently of his 
knowledge (or lack thereof) about a player's meta-game rule. It is 
possible to conceive of a worse case, however, in which the observer 
cannot discern between winning and losing strategies purely from the 
context of the game. Would such a case preclude the construction of a 
meta-matrix, and thus prevent him from assessing a player's rationa- 
lity? 

Reconsider, for example, the game of Rock, Scissors, Paper. It 
has been established that, if both players are a priori probabilists, 
they should both adopt a mixed strategy of uniform random play. Then, 
over the course of a large number of games, both players' net scores 
will tend toward zero. In the prior consideration of this game, it 
was tacitly assumed that both players obeyed a meta-game rule of 
maximizing their long-term winnings (or, equivalently in this class 
of game, of minimizing their long-term losses). 

But suppose both players now obey a meta-game rule of minimiz- 
ing their long-term winnings (or, equivalently, of maximizing their 
long-term losses). Instead of attempting to win as often as possible, 
both players are now (for some plausible reason) attempting to lose 
as often as possible. What strategies should they adopt? 

If player A wishes to lose and player B wishes to win, player A 
would choose a pure strategy of either Rock, Scissors, or Paper. 
Suppose he chooses Rock. Player B would soon respond with a pure 
strategy of Paper. Player A would lose, and player B would win, every 
game thereafter. Thus each would satisfy his respective meta-game 
rule (and both would be rational to the game-theoretic observer). 

However, if both players wish to lose, then A cannot adopt a 
pure strategy. (If he did so, again choosing Rock, then B would soon 
respond with a pure strategy of Scissors, and A would win every game 
thereafter.) If both players wish to lose, then they must each adopt 
a mixed strategy of uniform random play. This strategy is therefore 
degenerate: it is best both for mutually-desired long-term wins and 
for mutually-desired long-term losses. Thus the game-theoretic 
observer cannot ascertain, purely from the context of the play, 
whether both players obey a meta-game rule that maximizes wins, or 
losses. Nevertheless, the observer can readily construct a meta- 
matrix for either player, as follows: 


Game 3.2 - Observer's Meta-Matrix for Player A 
(with degenerate winning/losing strategy) 


                                  A's Meta-Game Rule 
                              Maximize        Maximize 
                              Winnings        Losses 
           Mixed, Uniform     player A        player A 
A's Game   Random Strategy    is rational     is rational 
Strategy 
           Pure or Non-       player A        player A 
           Uniform Strategy   is irrational   is irrational 


Strategic degeneracy does not affect the observer's ability to 
assess the rationality of either player, according to the working 
definition of rationality under consideration. 
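
The degeneracy of uniform random play can be checked by direct calculation. The sketch below assumes the customary payoffs of +1 for a win, -1 for a loss, and 0 for a draw (an assumption, since the payoffs are not restated in this chapter), and computes player A's expected payoff against each pure reply by B.

```python
from fractions import Fraction

MOVES = ("rock", "scissors", "paper")
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}


def payoff(a, b):
    """A's payoff for one game: +1 for a win, -1 for a loss, 0 for a draw."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1


def expected_payoff(mix_a, move_b):
    """A's expected payoff when A plays the mixture mix_a against B's pure move_b."""
    return sum(p * payoff(a, move_b) for a, p in mix_a.items())


uniform = {m: Fraction(1, 3) for m in MOVES}
pure_rock = {"rock": Fraction(1), "scissors": Fraction(0), "paper": Fraction(0)}

uniform_results = {b: expected_payoff(uniform, b) for b in MOVES}
rock_results = {b: expected_payoff(pure_rock, b) for b in MOVES}
print(uniform_results)
print(rock_results)
```

Under these assumptions the uniform mixture yields zero against every reply, so B can neither make A lose nor make A win; pure Rock, by contrast, loses to Paper and beats Scissors, as described above.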

Do situations arise which demand more of this concept of 
rationality than it can afford? Apparently, they do. The working 
definition becomes less workable in the following examples. 

Suppose player A is both very fond of strawberries and mildly 
allergic to them. He derives considerable pleasure from eating 
strawberries, but suffers a temporarily uncomfortable though other- 
wise harmless allergic reaction after eating them. If A is offered 
strawberries and declines them, is he rational? Certainly, if his 
meta-game rule prescribes the avoidance of discomfort whenever 
possible. But if A is offered strawberries and accepts them, is A 
irrational? Certainly not, if his meta-game rule permits the indul- 
gence of a gustatory pleasure with the consequence of a mild discom- 
fort. 

Now suppose that A is offered strawberries at consecutive 
meals; he declines them at breakfast, but accepts them at lunch. What 
can a game-theoretic observer infer about A's rationality? He can 
infer that, if A's meta-game rule was the same for both meals, then A 
was rational at one meal and irrational at the other. He can also 
infer that, if A's meta-game rule was not the same for both meals, 
then A was either rational at both meals, or was irrational at both. 

This example is one in which the meta-game rule does not 
necessarily remain fixed or constant throughout the duration of the 
meta-game itself, but is subject to change according to the shifting 
preferences of the player. Meta-game theorist Howard puts forward a 
definition of rationality based squarely upon this premise: 


"We say that rational behaviour consists in choosing the 
alternative one prefers." 


The working definition under consideration here is consistent with 
Howard's. A meta-game rule orders a player's preferences, while a 
game strategy chooses that alternative which reflects the ordering 
(if the player is rational), or which does not reflect it (if the 
player is irrational). 

The problem is that the player's rationality can be assessed 
only if the observer is informed of every shift in the player's 
preference. 
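
The observer's predicament in the strawberry example can be laid out exhaustively. The following sketch assumes just two hypothetical meta-game rules, labelled "avoid discomfort" and "indulge pleasure", and enumerates every assignment of rules to the two meals:

```python
from itertools import product

RULES = ("avoid discomfort", "indulge pleasure")
CONSISTENT = {("decline", "avoid discomfort"), ("accept", "indulge pleasure")}


def is_rational(choice, rule):
    """A choice is rational when it is consistent with the rule then in force."""
    return (choice, rule) in CONSISTENT


# What the observer actually sees: A declines at breakfast, accepts at lunch.
observed = {"breakfast": "decline", "lunch": "accept"}

verdicts = {}
for breakfast_rule, lunch_rule in product(RULES, repeat=2):
    verdicts[(breakfast_rule, lunch_rule)] = (
        is_rational(observed["breakfast"], breakfast_rule),
        is_rational(observed["lunch"], lunch_rule),
    )

for rules, verdict in verdicts.items():
    print(rules, verdict)
```

The same two observed choices yield four different verdicts, one for each unobserved pairing of rules, which is why the observer must be informed of every shift in preference.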

Now suppose the observer is player B in a game of imperfect 
information without a saddle point, in which changes in the players' 
preferences are mutually concealed. In that case, the rationality or 
irrationality of each player is indeterminate with respect to the 
other. This situation is worse than that of a game of perfect infor- 
mation without a saddle point, in which player B can be harmed by 
player A's irrationality (and vice-versa). In a game of imperfect 
information without a saddle point, player B can be harmed not only 
by A's irrationality, but also by B's possible mistaking of A's 
actual rationality for apparent irrationality (and vice-versa). 

The first example of the indeterminacy of rationality takes 
place in an inter-personal context; that is, each player behaves 
either rationally or irrationally ("behaves" in Howard's sense, by 
choosing the alternative he prefers), but neither player can infer 
the rationality or irrationality of the other. 

9 Howard, 1971, p.xx. 


A second example, the indeterminacy of whose expectations is 
well-known to game theorists, takes place intra-personally. It 
involves Pascal's question of whether to subscribe, or not to 
subscribe, to Roman Catholic theology. Pascal's matrix is as 
follows:¹⁰ 


Game 3.3 - Pascal's Question 


                                   State of Nature 
                          God exists            God does not exist 
           Practice       eternal reward        pious life only 
Pascal's   Catholicism 
Decision 
           Not Practice   eternal punishment    impious life only 
           Catholicism 


In games against a state of nature, the player does not know 
which state actually obtains, but assumes the actual state to be 
causally independent of his beliefs. In order to calculate his 
expected utilities, the player normally assigns a probability dis- 
tribution to the states of nature.¹¹ The result of Game 3.3 is in- 
determinate in terms of expected utilities, owing to the infinite 
positive and negative payoffs associated with eternal reward and 
punishment, respectively, and the ensuing transfinite arithmetic. 
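
One way in which infinite payoffs frustrate the calculation can be seen in a rough sketch. It assumes that IEEE floating-point infinities may stand in for eternal reward and punishment, and it assigns arbitrary finite placeholder values to the pious and impious lives; none of these figures comes from Pascal or from the sources cited, and the sketch is not offered as a resolution of the question.

```python
import math


def expected_utility(p_god, if_god, if_no_god):
    """Expected utility of one decision over the two states of nature."""
    return p_god * if_god + (1 - p_god) * if_no_god


# Finite payoffs are arbitrary placeholders (assumptions, not from the text).
PIOUS_LIFE, IMPIOUS_LIFE = -1.0, 1.0
REWARD, PUNISHMENT = math.inf, -math.inf

results = {}
for p in (0.00001, 0.5):
    practice = expected_utility(p, REWARD, PIOUS_LIFE)
    abstain = expected_utility(p, PUNISHMENT, IMPIOUS_LIFE)
    results[p] = (practice, abstain)
    print(p, practice, abstain)

# With probability zero, the infinite payoff is weighted by zero: undefined.
undefined = expected_utility(0.0, REWARD, PIOUS_LIFE)
print(undefined)
```

Under these assumptions the expected utilities are the same whether the probability is .00001 or .5, so the assigned distribution carries no weight in the result; and an infinite payoff weighted by zero has no defined value at all.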

But the concern here is not with the utility of Pascal's 
decision; rather, with its rationality. If Pascal decides to believe 
in the Catholic deity's existence (as a meta-game rule), then he 
would be rational to practice Catholicism (as a game strategy). 

But would Pascal be irrational to believe in such a deity's 
existence and not practice Catholicism? Not necessarily. If Pascal is 
a fatalistic theist, he might believe that his decision is pre- 
ordained. But if Pascal's decision is causally pre-determined by a 
deity, then it is not solely Pascal's decision. And if Pascal cannot 
make a free choice, then the meaning of the rationality or 
irrationality of his choice alters drastically. And if the deity in 
which Pascal believes allows him a choice, then Pascal is still not 
necessarily irrational not to practice Catholicism. For Pascal may 
prefer to sin now, and to seek absolution or redemption later. 


10 Variants of the matrix can be found e.g. in Jeffrey, 1965, 
p.12; and Howard, 1971, p.7. 


11 Pascal, for instance, assigned a subjective probability of 
.00001 to the state in which God exists; see Jeffrey, 1965, p.12. 

On the other hand, if Pascal decides not to believe in the 
Catholic deity's existence (as a meta-game rule), then he would be 
rational not to practice Catholicism (as a game strategy). 

But would Pascal be irrational not to believe in the deity's 
existence and to practice Catholicism anyway? Again, not necessarily. 
Given that his belief need not be absolute, Pascal may simply doubt 
the existence of such a deity, while practicing Catholicism in order 
to "hedge his bet". Or, Pascal's disbelief may be absolute, and he 
remains in a state of atheism, but practices Catholicism publicly to 
protect himself in the event of an Inquisition. 

The example of Pascal is meant to illustrate that, no matter 
what the player's beliefs in a game against nature, arguments can be 
found which support the "rationality" of any personal decision he 
takes. But if a distinction cannot be drawn between rationality and 
irrationality, then the game-theoretic concepts are indeterminate in 
this context. 

Thus the working definition of rationality (consistency between 
a player's meta-game rule and his game strategy), which satisfies 
Rapoport's game-theoretic criterion of independence from psychology
and Howard's meta-game-theoretic criterion of choosing the alterna-
tive that one prefers, is not universally applicable. Classes of
inter-personal games exist in which neither player can ascertain the 
other's rationality, or irrationality; and classes of intra-personal 
games exist in which the player cannot discern between rational and 
irrational choice. 

But the problems of game-theoretic rationality hardly end 
there. As will be seen next, one dimension of the Prisoner's Dilemma 
—and arguably the most significant dimension with respect to con- 
flict resolution—entails divergent meanings of rational choice. In 
games considered thus far, rationality and irrationality have been 
associated with (and, where definable, defined in terms of) the
individual player. But with reference to the Prisoner's Dilemma,
Rapoport suggests


"that the concept of rationality should be re-examined,
perhaps split into two concepts, individual rationality
and collective rationality."


A re-examination of the Prisoner's Dilemma will certainly bear out 
the cogency of Rapoport's suggestion. 

Sufficient essentials of game theory have been reviewed to 
enable such a re-examination. With these basic necessities in hand 
(an understanding of principal taxonomic criteria, and an apprecia- 
tion of the range of difficulties latent in utility theory and game- 
theoretic rationality), one is minimally equipped to consider some of 
the complexities in the Prisoner's Dilemma. 


l Rapoport (ed.), 1974, p.4. 


PART TWO: 


THE STATIC PRISONER'S DILEMMA


Chapter Four
Conflicting Choices and Rationalities 


Game-theoretic literature attributes the original Prisoner's 
Dilemma to A.W. Tucker. As to its early development, Rapoport
narrates:


"To my knowledge, the earliest experiments with Prisoner's
Dilemma were performed by Flood in 1952 . . . and do not seem
to have attracted much attention at the time. . . . The
'paradox' was discussed by several of the Fellows at the
Centre for Advanced Study in the Behavioural Sciences in Palo
Alto during the first year of its operation (1954-55). . . .
Possibly a decisive impetus to experimental work was given by
a paper by Schelling, published in 1958. At any rate, it seems
that the first experiment since Flood's was performed by
Deutsch in 1958. Thereafter the number of experimental papers
on Prisoner's Dilemma increased very rapidly."

Both theoretical and experimental interest in the Prisoner's 
Dilemma are stimulated by the model's structural properties. As a
non-zero-sum, non-co-operative game, the Prisoner's Dilemma resists 
absolute theoretical prescriptions as to the "best" line of play. In 
consequence, a limitless range of experiments can be conducted, whose 
results may correlate with a wide variety of factors, from differing 
characteristics of the players to variants of the game itself. 

The Prisoner's Dilemma can be played in both the static and the 
iterated modes. Logically and chronologically, the former gives rise 
to the latter, so it is meet to commence with the former. As von
Neumann and Morgenstern declared in their general theory of
two-person, zero-sum games:


1 E.g. Luce & Raiffa, 1957, p.94; Rapoport & Chammah, 1965,
p.24; Singleton & Tyndall, 1974, p.101. 


2 Rapoport (ed.), 1974, pp.19-20. The papers to which Rapoport 
refers are: M. Flood, 'Some Experimental Games', Research Memorandum
RM-789, The Rand Corporation, Santa Monica, 1952; T. Schelling, 'The
Strategy of Conflict: Prospectus for the Reorientation of Game
Theory', Journal of Conflict Resolution, 2, 1958, pp.203-64; M.
Deutsch, 'Trust and Suspicion', Journal of Conflict Resolution, 2,
1958, pp.267-79.


"We make no concessions. Our viewpoint is static and we
are analyzing only a single play."

Indeed, numerous aspects of game theory were subsequently developed,
maintaining theoretical pace with the empirical transition from 
static to iterated non-zero-sum games. An understanding of the static 
Prisoner's Dilemma is a prerequisite for an appreciation of the 
increased complexities of iterated Prisoner's Dilemmas. 

The static Prisoner's Dilemma arises from a particular type of 
scenario, many versions of which are rehearsed in the literature. 
Though the model has been embellished in a variety of ways, varia- 
tions in the narrative details do not alter the problem itself, which 
inheres in specific properties of the game-matrix. 

One version, then, is as follows: suppose two suspects are 
arrested, held incommunicado, and interrogated. Call them prisoner A 
and prisoner B. Each prisoner faces an identical choice: he can 
either divulge evidence against his fellow-prisoner, or refuse to do 
so. Since each prisoner must make a choice, the prisoners will thus 
generate a joint outcome, but without collusion. Both prisoners are 
made aware of the payoffs of each possible outcome, which are: 

(1) If both A and B refuse to divulge evidence against one
another, they will both be set free.

(2) If A divulges evidence against B and B does not divulge
evidence against A, then A will be given a bribe and set free, while
B will serve a heavy sentence.

(3) If B divulges evidence against A and A does not divulge
evidence against B, then B will be given a bribe and set free, while
A will serve a heavy sentence.

(4) If both A and B divulge evidence against one another, they 
will both serve light sentences. 

In the conventional terminology of the Dilemma, each prisoner 
must choose between co-operating and defecting, with respect to his 
fellow prisoner. To defect means to divulge evidence; to co-operate 
means to refuse to divulge evidence. The game matrix can be
constructed as follows:


3 Neumann & Morgenstern, 1955, p.147.


Game 4.1 — The Prisoner's Dilemma


                B
            c       d
       C   R,R     S,T
   A
       D   T,S     P,P


where T > R > P > S
for prisoner A: C means co-operate, D means defect
for prisoner B: c means co-operate, d means defect


The numerical values of the payoffs may fluctuate in a given 
Prisoner's Dilemma, but their transitive ordering does not change. T 
stands for the temptation to defect; R, for the reward of mutual co-
operation; P, for the punishment of mutual defection; S, for the so-
called "sucker's payoff".4

As will be seen throughout Part Two, the dilemma admits of 
several facets of interpretation. 

The initial dilemma can be viewed as arising from the breakdown 
of the fundamental property of strictly-determined zero-sum games, 
when applied to certain non-zero-sum games; namely, the minimax 
criterion.5 In game 4.1, a generalized Prisoner's Dilemma, the (P,P)
outcome resulting from mutual defection is, in effect, a saddle point
of the matrix.


Recall that, in a two-person zero-sum game with a saddle point, 
a player who seeks to maximize his payoff fares best by choosing that 
row (or column) which contains the saddle point, regardless of the 
other player's choice. In the Prisoner's Dilemma, however, this 


4 This conventional notation is used e.g. by Rapoport & Chammah,
1965, pp.334 et passim; by R. Axelrod, 'The Emergence of Cooperation
Among Egoists', The American Political Science Review, 75, 1981,
pp.306-18; by R. Axelrod & W. Hamilton, 'The Evolution of Coopera-
tion', Science, 211, 1981, pp.1390-6; among others.


5 A number of zero-sum game properties are violated in non-zero-
sum games; e.g. see Luce & Raiffa, 1957, pp.90-94.


property no longer holds. For, in game 4.1, if both players choose
that row (and column) containing the saddle point, they attain the
outcome (P,P), and thereby fail to realize the more mutually favour-
able outcome (R,R). This situation cannot obtain in a zero-sum game
with a saddle point. The non-zero-sum game differs critically, in 
that a player who seeks to maximize his own payoff is obliged to take 
the other player's possible choice into account (saddle points 
notwithstanding). 

In game 4.1, it can still be argued that player A fares better 
by defecting, in terms of possible payoffs to himself alone, regard-
less of player B's choice. But if player B reasons similarly, then
the resultant outcome is not the most mutually favourable outcome.
Then again, if player A risks co-operation, then he stands either to
gain relatively less, or else to lose relatively more, than through 
defection, depending upon player B's choice. Thus each player must 
run the risk of incurring the most detrimental individual payoff if 
he wishes to achieve the most beneficial collective payoff. 

It is desirable to describe this situation in more formal 
terms. A useful way in which to do so is to represent the initial 
dilemma as a conflict between two principles of choice: dominance 
versus maximization of expected utility. 

The dominance principle operates as follows: choice X strongly
dominates choice Y if and only if, for each game-state (joint out-
come), A prefers the consequences of X to those of Y. Choice X weakly
dominates choice Y if: for each game-state, A either prefers the
consequences of X to those of Y or is indifferent between them; and
for some game-state or states, A prefers the consequence(s) of X to
that (those) of Y. Two simple examples illustrate this principle.


6 These principles are common to game theory and decision
theory; e.g. see R. Nozick, 'Newcomb's Paradox and Two Principles of
Choice', in N. Rescher (ed.), Essays in Honour of Carl G. Hempel, D.
Reidel, Dordrecht, 1969, pp.114-146.




Game 4.2 — Strong Dominance          Game 4.3 — Weak Dominance

             B                                    B
          x       y                            x       y
     X   4,1     9,5                      X   4,1     9,5
  A                                    A
     Y   3,2     8,6                      Y   4,2     8,2


In game 4.2, choice X strongly dominates choice Y for A (since
4 > 3 and 9 > 8), while choice y strongly dominates choice x for B
(since 5 > 1 and 6 > 2). In Game 4.3, choice X weakly dominates
choice Y for A (since 4 = 4 and 9 > 8), while choice y weakly domi-
nates choice x for B (since 5 > 1 and 2 = 2).

In Game 4.1, the Prisoner's Dilemma, defection is strongly
dominant for both A and B (since, for both prisoners, T > R and P >
S). Hence the dominance principle dictates that each prisoner should
defect. But if both prisoners defect, the outcome (P,P) is mutually 
detrimental. 
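
The dominance comparisons above can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the function name `dominance` is hypothetical, and the numerical payoffs given to Game 4.1 (T=5, R=3, P=1, S=0) are assumed for illustration only, chosen to respect the ordering T > R > P > S.

```python
from typing import Sequence


def dominance(first: Sequence[float], second: Sequence[float]) -> str:
    """Compare two choices state by state, from one player's point of view.

    Each sequence lists that player's payoffs for one choice, in the same
    order of game-states. Returns "strong", "weak" or "none" according to
    whether the first choice dominates the second.
    """
    pairs = list(zip(first, second))
    if all(a > b for a, b in pairs):
        return "strong"
    if all(a >= b for a, b in pairs) and any(a > b for a, b in pairs):
        return "weak"
    return "none"


# Game 4.2: A compares X with Y; B compares y with x.
game_4_2_A = dominance([4, 9], [3, 8])
game_4_2_B = dominance([5, 6], [1, 2])

# Game 4.3: the same comparisons, with one tie for each player.
game_4_3_A = dominance([4, 9], [4, 8])
game_4_3_B = dominance([5, 2], [1, 2])

# Game 4.1 with assumed payoffs: A compares D (T, P) with C (R, S).
T, R, P, S = 5, 3, 1, 0
game_4_1_A = dominance([T, P], [R, S])

print(game_4_2_A, game_4_2_B, game_4_3_A, game_4_3_B, game_4_1_A)
# prints: strong strong weak weak strong
```

Because the strong-dominance test for Game 4.1 uses only the comparisons T > R and P > S, any payoffs respecting the stated ordering would give the same verdict.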

The principle of maximization of expected utility was encoun- 
tered in Chapter Two. To re-iterate: the expected utility of a given 
row (or column) is the sum of the products of the utility of each 
game-state in that row (or column) and the respective probability 
with which that game-state obtains. Most generally, if a given row
(or column) contains n states, and the utility of the i-th state is
U_i, and the i-th state obtains with probability p_i, then the
expected utility of that row (or column) is


EU = \sum_{i=1}^{n} U_i p_i

To maximize expected utility, then, one chooses that row (or column) 
for which the EU is greatest. 

In Game 4.1, the respective utilities of each game-state are 
ordered (on the ordinal scale), but the probability that each game- 
state obtains has yet to be assigned. How are probabilities to be 
distributed among the game-states of the static Prisoner's Dilemma?
In the iterated case, it will be seen that probabilistic and causal
dependencies arise which favour the frequency and likelihood of
symmetric game states, either (R,R) or (P,P). But since the static
case is being treated as logically prior to the iterated case, it 
seems inappropriate to admit iterated criteria at this juncture. 
Since the static case is an isolated case, a posteriori probabilities 
(frequency distributions) are presumably unavailable to the priso- 
ners. Thus the prisoners would be obliged to assign probabilities on 
some a priori basis. 

For example, were player A to apply the principle of insuf-
ficient reason, then he would assume that player B will co-operate or
defect with equal probability (1/2). In that case, his expected
utility of co-operation would be


EU(C) = (1/2)R + (1/2)S

while his expected utility of defection would be

EU(D) = (1/2)T + (1/2)P

Since T > R and P > S, maximization of expected utility via the 
principle of insufficient reason suggests mutual defection. 
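
As a numerical illustration of the calculation just described, the sketch below evaluates both expected utilities under equal probabilities. The function name `expected_utility` is hypothetical, and the payoffs T=5, R=3, P=1, S=0 are assumed values that merely respect the ordering T > R > P > S; the ordinal argument in the text does not depend on them.

```python
def expected_utility(utilities, probabilities):
    """Sum of (utility x probability) over the game-states of one row."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    return sum(u * p for u, p in zip(utilities, probabilities))


# Assumed illustrative payoffs with T > R > P > S.
T, R, P, S = 5, 3, 1, 0

# Principle of insufficient reason: B co-operates or defects with p = 1/2.
eu_cooperate = expected_utility([R, S], [0.5, 0.5])
eu_defect = expected_utility([T, P], [0.5, 0.5])

print(eu_cooperate, eu_defect)  # prints: 1.5 3.0
```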

However, an argument can be made that a player should not apply 
said principle. By definition, the principle of insufficient reason 
states that


"alternatives are always to be judged equiprobable if we 
have no reason to expect or prefer one over another." 


While a prisoner may have no reason to expect one joint outcome over 
another, he certainly has valid reason to prefer one joint outcome to 
another. Each prisoner will order the joint outcomes according to the 
payoffs they contain for him; e.g., for prisoner A: (T,S) > (R,R) >
(P,P) > (S,T). Given the expressibility of preferences, the principle
of insufficient reason seems to rule itself out.

The prisoners have recourse to a more interesting—and arguably 
more appropriate—a priori probability distribution, which follows 


7 Weatherford, 1982, p.29.


from particular properties of the game matrix. The matrix of game 4.1
has an equilibrium outcome at (P,P). An equilibrium outcome is one


". . . from which neither player can shift without impairing
his payoff, assuming that the other player does not shift."8


The matrix also has a Pareto-optimal outcome at (R,R):


"An outcome of a game is called Pareto-optimal if there 
is no other outcome in which both players get a larger 
payoff." 


The existence of equilibrium and Pareto-optimal outcomes may justify 
an a priori assumption of their probabilistic dependence. In other 
words, each prisoner may deem it likely that their joint decision 
will result in either an equilibrium or a Pareto-optimal outcome. 
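
The two quoted definitions can be applied directly to the matrix of game 4.1. The sketch below is illustrative only: the helper names are hypothetical, and the payoffs T=5, R=3, P=1, S=0 are assumed values respecting T > R > P > S. Under the quoted definition of Pareto-optimality, the asymmetric outcomes (S,T) and (T,S) also qualify, since no other outcome gives both players more; only (P,P) fails the test.

```python
# Assumed illustrative payoffs with T > R > P > S.
T, R, P, S = 5, 3, 1, 0

# Outcome (A's choice, B's choice) -> (payoff to A, payoff to B).
payoffs = {
    ("C", "c"): (R, R),
    ("C", "d"): (S, T),
    ("D", "c"): (T, S),
    ("D", "d"): (P, P),
}
flip = {"C": "D", "D": "C", "c": "d", "d": "c"}


def is_equilibrium(a, b):
    """Neither player can shift alone without impairing his own payoff."""
    mine, theirs = payoffs[(a, b)]
    a_loses_by_shifting = payoffs[(flip[a], b)][0] < mine
    b_loses_by_shifting = payoffs[(a, flip[b])][1] < theirs
    return a_loses_by_shifting and b_loses_by_shifting


def is_pareto_optimal(a, b):
    """No other outcome gives both players a larger payoff."""
    ua, ub = payoffs[(a, b)]
    return not any(va > ua and vb > ub for va, vb in payoffs.values())


equilibria = [o for o in payoffs if is_equilibrium(*o)]
pareto_optimal = [o for o in payoffs if is_pareto_optimal(*o)]

print(equilibria)      # prints: [('D', 'd')]
print(pareto_optimal)  # prints: [('C', 'c'), ('C', 'd'), ('D', 'c')]
```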

In that case, each prisoner would weight the probabilities such
that p(R,R) > p(S,T) and p(P,P) > p(T,S). In terms of individual
choice, prisoner A would weight p(c/C) > p(d/C) and p(d/D) > p(c/D),
where p(c/C) means "the probability that prisoner B co-operates (c),
conditional on the assumption that prisoner A co-operates (C)", and
so forth.10 Similarly, prisoner B would weight p(C/c) > p(D/c) and
p(D/d) > p(C/d).

Now prisoner A finds his expected utilities to be 


EU(C) = p(c/C)(R) + (1-p)(d/C)(S)
EU(D) = (1-p)(c/D)(T) + p(d/D)(P)


where p > 1/2 


If prisoner A assumes complete probabilistic dependence, then 


8 A. Rapoport, M. Guyer, D. Gordon, The 2x2 Game, The University 
of Michigan Press, Ann Arbor, 1976, p.18. See also R. Weber,
'Noncooperative Games', Proceedings of Symposia in Applied Mathematics,
24, 1981, pp.83-125. 


9 Rapoport et al, 1976, pp.18-19. 


10 This notation is from R. Campbell, 'Background for the Unini-
tiated', in R. Campbell & L. Sowden (eds.), Paradoxes of Rationality
and Cooperation, The University of British Columbia Press, Vancouver,
1985, p.18ff.


p(c/C) = p(d/D) = 1 and (1-p)(d/C) = (1-p)(c/D) = 0. Explicitly,
then, A's expected utilities are


EU(C) = R
EU(D) = P


Since R > P, maximization of expected utility prescribes co-opera- 
tion. The argument is symmetric for prisoner B. Thus both prisoners 
co-operate, to their mutual benefit. 

Of course, if a prisoner assumes partial probabilistic depen- 
dence, then the general result is indeterminate. For instance, if for 
some reason prisoner A assumes p(c/C) = p(d/D) = x and p(d/C) =
p(c/D) = (1-x), then his expected utilities are


EU(C) = xR + (1-x)S
EU(D) = (1-x)T + xP


Prisoner A co-operates if EU(C) > EU(D). For this to be the case, 


xR + (1-x)S > xP + (1-x)T
or


(R-P)/(T-S) > (1/x)-1          (4.1)


Consider the left hand side of inequality (4.1). Since T > R
and S < P, the denominator is always larger than the numerator. Thus
the left hand side of this inequality must always be smaller than 
unity. It approaches (but never reaches) the value of unity as an 
upper limit, in cases where R is almost as large as T and S is almost 
as large as P. 

Now consider the right hand side of inequality (4.1). It can 
take on a range of values for the permitted domain of x (0 < x < 1).
As x approaches zero, the right hand side increases without bound; as x approaches
unity, it tends toward zero. At x = 1/2, the right hand side equals 
unity, which is the upper limit of the left hand side. 

So, for inequality (4.1) to be satisfiable, the value of x must 
exceed one-half. When it does so, the right hand side is less than
unity, and the inequality can be satisfied by appropriate values of
T, R, P, and S. Thus, if the maximization of expected utilities is to
prescribe co-operation via the rule of partial probabilistic depen-
dence, the conditional probabilities of mutual co-operation, p(c/C),
and of mutual defection, p(d/D), must exceed the critical value of 
1/2. When they do so, the principle of maximizing expected utilities 
may prescribe co-operation, depending on the particular payoffs. But 
when the conditional probabilities do not exceed the critical value 
of one-half, the principle always prescribes defection, regardless of 
the payoffs. 
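
For a given set of payoffs, the critical value of x can be made explicit. Rearranging inequality (4.1) for x gives x > (T-S)/((T-S)+(R-P)); this closed form is a restatement offered here for illustration and does not appear in the text above. Since T-S exceeds R-P, the bound always lies above one-half, in agreement with the preceding argument. The sketch below uses hypothetical function names and the assumed payoffs T=5, R=3, P=1, S=0.

```python
def critical_probability(T, R, P, S):
    """Value of x above which inequality (4.1) is satisfied.

    Obtained by rearranging (R-P)/(T-S) > (1/x)-1 for x, given 0 < x < 1.
    """
    return (T - S) / ((T - S) + (R - P))


def prescribes_cooperation(x, T, R, P, S):
    """True if EU(C) > EU(D) under partial probabilistic dependence x."""
    eu_c = x * R + (1 - x) * S
    eu_d = (1 - x) * T + x * P
    return eu_c > eu_d


# Assumed illustrative payoffs with T > R > P > S.
T, R, P, S = 5, 3, 1, 0

threshold = critical_probability(T, R, P, S)   # 5/7, roughly 0.714
print(round(threshold, 3))                     # prints: 0.714

for x in (0.5, 0.6, 0.8):
    print(x, prescribes_cooperation(x, T, R, P, S))
# prints: 0.5 False, then 0.6 False, then 0.8 True
```

With these payoffs a conditional probability of 0.6, although above one-half, still prescribes defection; this is the region of conditional defection in which the result depends on the particular payoffs.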

These considerations can be illustrated graphically, where the 
natural logarithms of both sides of inequality (4.1) are plotted 
against the permitted domain of x.


Graph 4.1   REGIONS OF CO-OPERATION AND DEFECTION
            Maximization of Expected Utility with
            Partial Probabilistic Dependence

[Graph not reproduced. Horizontal axis: x = p(c/C) = p(d/D), from 0
to 1. Curves: f(x) = ln[(1/x)-1] and ln[(R-P)/(T-S)]. Labelled region:
region of unconditional defection.]


Graph 4.1 delineates regions of co-operation, and of defection. 
The region of unconditional defection is bounded above by the 
x-axis for 0 < x < 1/2. The graph depicts a previous algebraic 
result, that inequality (4.1) cannot be satisfied in this domain. In
other words, maximizing expected utility with partial probabilistic
dependence of less than one-half prescribes defection for all values
of T, R, P and S (such that T > R > P > S).

The region of conditional defection is bounded above by the
curve f(x) = ln[(1/x)-1], for 1/2 < x < 1. In this domain, maximizing
expected utility prescribes defection if inequality (4.1) is not
satisfied, i.e. if (R-P)/(T-S) < (1/x)-1. For any partial
probability x, in this domain, the result depends upon the particular
payoffs of the given game.

The region of conditional co-operation is bounded above by the
x-axis, and below by the curve f(x) = ln[(1/x)-1], for 1/2 < x < 1.
Maximization of expected utility prescribes co-operation if ine-
quality (4.1) is satisfied, i.e. if (R-P)/(T-S) > (1/x)-1. Note that
the area of the region of conditional co-operation increases as x 
approaches unity. This area is proportional to the number of possible 
values of T, R, P and S for which inequality (4.1) is satisfied. 

At x = 1, f(x) is undefined, since unity is that value of x for
which partial probabilistic dependence becomes complete probabilistic
dependence. The area in this region increases without bound as x gets
very close to unity, and the graph depicts a previous algebraic
result: that in the case of complete probabilistic dependence,
maximization of expected utility prescribes unconditional co-opera-
tion, for all values of T, R, P and S (such that T > R > P > S).

In so far as the dilemma confronting the prisoners arises from 
diverging dictates of two decision-theoretic principles of choice, 
the situation can be summarized as follows. 

For each prisoner, defection strongly dominates co-operation. 
The dominance principle dictates that each prisoner fares better by 
defecting than by co-operating, no matter what the other prisoner 
does. But if both prisoners defect, they achieve an equilibrium 
outcome, which is mutually detrimental. 

On the other hand, the existence of equilibrium and Pareto-
optimal outcomes in the matrix may incline each prisoner to maximize
his expected utility. If both adopt the rule of complete probabilis-
tic dependence, then both co-operate, and they achieve a Pareto-
optimal outcome, which is mutually beneficial. If both adopt the rule
of partial probabilistic dependence, then the joint outcome depends
upon their respective probability weights and the given payoffs of 
the game. 

The inner workings of this dilemma are not trivial, even in the 
static case under consideration. Although the two decision-theoretic 
principles may prescribe conflicting choices, they do not do so 
unequivocally. An inner problem is embedded in the calculus of each 
principle, which prevents a rational player from adopting either 
unreservedly. Briefly stated, these problems are: 

(1) If the dominance principle is rational for each prisoner, 
why does its mutual adoption result in a detrimental joint outcome? 

(2) If the principle of maximizing expected utility is rational 
for each prisoner, is the associated rule of probabilistic dependence
to be complete, or partial? And, if the principle is adopted with 
partial probabilistic dependence, how does a rational prisoner assign 
the corresponding probability weights? 

But in asking these two questions, one begs a third: 

(3) What, if anything, constitutes "rational" choice in the 
Prisoner's Dilemma? The current working definition, that it is 
rational to choose the alternative one prefers, can lead to any of 
the four joint outcomes. In that case, the prisoners may as well flip 
coins as apply decision theory. Since the working definition of game- 
theoretic rationality cannot distinguish between individually and 
mutually beneficial, or detrimental, outcomes, one might posit a 
criterion of game-theoretic meta-rationality: to be "meta-rational" 
is to be aware of the deficiency of the game-theoretic concept of 
rationality as it stands. 

According to this hypothetical criterion, Rapoport is highly 
meta-rational. In his view: 


"Either the concept of rationality is not well-defined in 
the context of the non-negotiable non-zero-sum game; or 
if the definition of rationality in the context of the 
zero-sum game is applied to the "solution" of some non- 
zero-sum games, the results are paradoxical." 


In the case of the Prisoner's Dilemma, it seems that Rapoport's 
disjunction is actually a conjunction. The "paradox", in this case, 


11 Rapoport, 1966, p.142.


arises because each prisoner has a strongly dominant choice (defec-
tion) that leads to a mutually-detrimental result. Is the soundness
of the dominance principle suspect? Not necessarily. The core prob- 
lem, which Rapoport identifies, lies in the application of zero-sum 
rationality to a non-zero-sum game. 

Rapoport's insight provides answers to the first two questions 
by addressing itself to the third. 

In applying the dominance principle, prisoner A chooses the set 
of outcomes that is best for him. In a two-person zero-sum game, the 
best set of outcomes for A is also the worst set of outcomes for B, 
since B must always forfeit exactly what A gains (and vice-
versa). If A prefers to maximize his payoff, and if one choice
dominates another, then A is rational to make the dominant choice. 


Game 4.4 — The Prisoner's Non-Dilemma 


                B
            c        d
       C   R,-R    -T,T
   A
       D   T,-T     P,-P


where T > R > P


Game 4.4 represents an attempt to impose a zero-sum condition 
upon the Prisoner's Dilemma. For prisoner A, defection is strongly 
dominant, since T > R and P > -T. For prisoner B, defection is also
strongly dominant, since T > -R and -P > -T. Thus both prisoners
defect. But in this case, the outcome (P,-P) is not mutually detri- 
mental; rather, mutually optimal. Why? Because (P,-P) is a saddle 
point of the matrix. If A defects, he gains at least P utiles; if B 
defects, he loses at most P utiles. Since mutual defection leads to 
minimax, a zero-sum Prisoner's Dilemma presents no dilemma to the 
prisoners. 
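
The saddle point of Game 4.4 can be located by the usual test: a cell that is the minimum of its row and the maximum of its column, reading the matrix as payoffs to A. The sketch below is illustrative; the function name is hypothetical and the values T=5, R=3, P=1 are assumed, respecting T > R > P.

```python
def saddle_points(matrix):
    """Cells that are the minimum of their row and the maximum of their column.

    The matrix holds payoffs to the row player (A) in a zero-sum game;
    the column player (B) receives the negative of each entry.
    """
    points = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            column = [r[j] for r in matrix]
            if value == min(row) and value == max(column):
                points.append((i, j))
    return points


# Assumed illustrative values with T > R > P.
T, R, P = 5, 3, 1

# Rows: A plays C, D. Columns: B plays c, d. Entries: payoff to A.
game_4_4 = [
    [R, -T],
    [T, P],
]

print(saddle_points(game_4_4))  # prints: [(1, 1)]
```

The single saddle point is the cell (D, d), whose value P is what A can guarantee himself by defecting and the most B can be made to lose by defecting.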

In the non-zero-sum Prisoner's Dilemma, however, mutual defec- 
tion leads to an outcome that is not mutually optimal. Why? Because 
the zero-sum criterion of rationality prescribes defection to each
prisoner. According to this criterion, if prisoner A prefers to
maximize his gains and minimize his losses, and if a dominant choice 
exists, then he should make that choice. Hence he is rational to 
defect. And so with B. But this criterion originates in the context 
of a zero-sum game, in which, by definition, the algebraic sum of the 
joint payoffs of any outcome is always zero. In a non-zero-sum game,
however, this constraint vanishes; differences between algebraic sums 
of joint payoffs now exist. These differences must be taken into ac- 
count, since they can generate a Pareto-optimal outcome. 

Game 4.4 (the Prisoner's non-Dilemma) has an equilibrium 
outcome at (P,-P). Neither prisoner can shift from it without impair- 
ing his payoff, assuming that the other player does not shift. If 
either prisoner prefers the equilibrium outcome, he applies the 
dominance principle, and obtains, at worst, his preference. Dominance 
is effective in zero-sum games because the criterion of rationality 
is workable in zero-sum games. The criterion in turn is workable 
because, in zero-sum games, every outcome is Pareto-optimal.12

Game 4.1 (the Prisoner's Dilemma) also has an equilibrium 
outcome at (P,P). But the criterion of zero-sum rationality, which 
demands only that a player choose the alternative he prefers, fails 
to guarantee Pareto-optimality in this non-zero-sum case, because the 
equilibrium outcome (P,P) is no longer Pareto-optimal. 

One can now appreciate the cogency of Rapoport's differentia- 
tion between individual and collective rationality. Individual 
rationality is applicable in zero-sum games. But in non-zero-sum 
games, collective rationality must be applied, in order that the 
players do not pre-empt a Pareto-optimal outcome by exercising 
individually rational choices. A working definition of collective 
rationality demands that a player attempt to achieve a Pareto-optimal 
outcome, if one exists. At the same time, a player who is
collectively rational must be able to protect himself, in so far as a
given game allows, from a player who is individually rational, or
otherwise irrational.


12 In any zero-sum game, every outcome satisfies the condition
of Pareto-optimality: namely, that no other outcome contains larger
payoffs for both players. This condition, being universally true in
zero-sum games, retains little significance in them.


13 Rapoport (ed.), 1974, p.4.

This enquiry does not attempt to formulate a definition of 
collective rationality that is workable across the broad spectrum of 
non-zero-sum games. It does, however, attempt to realize a more 
limited objective; namely, an implementation of Rapoport's concept of 
collective rationality in the context of the Prisoner's Dilemma. Thus 
far, the attempt provides an answer to question (1) above: the 
dominance principle leads to a mutually-detrimental outcome because, 
although individually rational, it is not collectively rational. 

Next, one seeks answers to questions (2) and (3). It is pos- 
sible to formulate a working definition of collective rationality 
that answers these questions simultaneously. Suppose that prisoner A 
is collectively rational if 

(i) he elects to maximize his expected utility, and

(ii) he adopts the rule of either complete or partial probabi-
listic dependence, assigning to p(c/C) and p(d/D) the probability
that prisoner B is collectively rational.

It can be immediately contested that condition (ii) of this 
proposed definition is impredicative; nonetheless, given that the 
type of rationality under consideration is not of the individual 
kind, it may be permissible in these unusual circumstances to define 
the collective aspect in terms of the collective itself. One may put 
this objection on one side, and see whether the definition can 
counter it in operation. 

Let prisoner A be collectively rational, according to condi- 
tions (i) and (ii). Now suppose the probability of B's collective 
rationality is unity. In that event, A maximizes his expected utility 
with p(c/C) = p(d/D) = 1, which is the case of complete probabilistic 
dependence. Consequently, A co-operates. But B is also collectively
rational, and the probability of A's collective rationality is also 
unity. So, according to conditions (i) and (ii), B maximizes his 
expected utility with p(C/c) = p(D/d) = 1, and consequently B too co- 
operates. So mutual collective rationality, on these terms, leads to 
the desired outcome of Pareto-optimality.

Now let prisoner A be collectively rational, again according to
conditions (i) and (ii), and suppose the probability of B's collec-
tive rationality is zero. A maximizes his expected utility with
p(c/C) = p(d/D) = 0, which lies in the region of unconditional
defection. So A defects. But B is not collectively rational, and may
defect upon a whim. In that case, A protects himself against B's
individual rationality, and against any other form of irrationality 
that leads B to defect. If B's irrationality leads him to co-operate, 
for a bizarre or capricious reason, then A fares even better by 
defecting. 

Now let prisoner A be collectively rational, again according to 
conditions (i) and (ii), and suppose the probability of B's collec-
tive rationality lies between zero and unity. If said probability is
less than or equal to one-half, then A defects unconditionally. If it 
is greater than one-half, then A's maximization of expected utility 
lies in the region of conditional co-operation or defection. A either 
co-operates or defects, depending on the actual payoffs involved. In 
general, the greater the probability that prisoner B is collectively 
rational, the greater the number of cases in which collectively 
rational prisoner A will choose co-operation. 
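
The three cases just considered follow from a single decision rule, sketched below under stated assumptions. The function names are hypothetical; the payoffs T=5, R=3, P=1, S=0 are assumed for illustration; and the treatment of an exact tie between the two expected utilities (resolved here as defection) is an assumption, since the text prescribes co-operation only when EU(C) > EU(D).

```python
def collectively_rational_choice(q, T, R, P, S):
    """Choice of a prisoner satisfying conditions (i) and (ii).

    q is the probability he assigns to the other prisoner's collective
    rationality, used as p(c/C) = p(d/D) = q. Ties are resolved as
    defection, which is an assumption of this sketch.
    """
    eu_cooperate = q * R + (1 - q) * S
    eu_defect = (1 - q) * T + q * P
    return "C" if eu_cooperate > eu_defect else "D"


def joint_outcome(q_a, q_b, T, R, P, S):
    """Choices of A and B, given each one's estimate of the other."""
    return (collectively_rational_choice(q_a, T, R, P, S),
            collectively_rational_choice(q_b, T, R, P, S))


# Assumed illustrative payoffs with T > R > P > S.
T, R, P, S = 5, 3, 1, 0

print(collectively_rational_choice(1.0, T, R, P, S))  # prints: C
print(collectively_rational_choice(0.0, T, R, P, S))  # prints: D
print(collectively_rational_choice(0.6, T, R, P, S))  # prints: D
print(collectively_rational_choice(0.8, T, R, P, S))  # prints: C
print(joint_outcome(0.4, 0.4, T, R, P, S))            # prints: ('D', 'D')
```

The last line shows two prisoners who both follow the rule, yet each estimates the other's collective rationality at less than one-half; both defect.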

This working definition of collective rationality answers 
questions (2) and (3), and seems to overrule the objection of im- 
predicativity. It allows two collectively rational prisoners to 
achieve a Pareto-optimal outcome, and also affords a measure of 
protection to a collectively rational prisoner whose fellow-prisoner 
is not collectively rational. 

Unfortunately, the static Prisoner's Dilemma is not so handily 
resolved. The proposed definition of collective rationality, while 
quite workable in theory, encounters a formidable barrier in prac-
tice. There is simply no analytic method, in the static mode, by
which one prisoner can ascertain the probability of the other's
collective rationality. A prisoner who wishes to do so falls into one
of two broad streams of probabilistic thought: the a priori, and the
a posteriori. The a posteriori probabilistic prisoner cannot be in 
possession of a frequency distribution of the other prisoner's 
previous choices (made in other dilemmas), since the static model
represents a single, isolated case. Nor can the a priori probabilis-
tic prisoner be made aware of the other prisoner's current delibera-
tions or intentions since, according to the ground rules of the
model, the prisoners are held incommunicado.

It would seem that one prisoner's evaluation of the probability 
of the other prisoner's collective rationality is a matter of guess- 
work. As such, a prisoner may make a grossly inaccurate assessment, 
with disastrous results for either himself or his fellow-prisoner. 
And if both prisoners are collectively rational, but both incorrectly 
assess the other's probability of being such as less than one-half, 
then both prisoners defect, to their mutual detriment. Unless the 
collectively rational prisoner is able to find a reliable way to
ascertain the probability of the other's collective rationality, then 
his own collective rationality amounts to no more than a beneficial 
intention. While a beneficial intention may be an estimable factor in 
the resolution of conflict generally, it is plainly susceptible to 
misdirection in the static Prisoner's Dilemma, where it can prove as 
inimical, to either prisoner, as a hostile predisposition. Again, the 
Prisoner's Dilemma resists an infallible resolution. 

Game-theorists who are unwilling to be confounded by the 
dilemma have brought no small ingenuity to bear upon the problem. Two 
significant proposed resolutions are examined in the next two chap- 
ters. The model, however, exhibits a disquieting, Hydra-like proper- 
ty: the resolution of one dilemma seems to engender the appearance of 
another. 

For example, Rapoport's notion of collective rationality has 
drawn the following criticism: Davis argues that if each prisoner is 
concerned with the joint outcome, then the Prisoner's Dilemma ceases 
to be a dilemma.⁴ Suppose one alters the ground rules, and permits
the prisoners to signal or even to discuss their intentions. In other 
words, one changes the model from a non-co-operative to a co-opera- 
tive game. If the prisoners collude, and make a pact not to defect, 
then the dilemma appears to vanish. 


4 Davis, 1970, pp.101-102.
Not so, according to Rapoport and Chammah, who anticipated and
countered the criticism:


"It is clear, however, that if the pact is not enforce-
able, a new dilemma arises. For now each of the prisoners
faces a decision of keeping the pact or breaking it. This
choice induces another game exactly like the Prisoner's
Dilemma, because it is in the interest of each to break
the pact regardless of whether the other keeps it."⁵


It is clear that the conflict within the Prisoner's Dilemma may 
be transposed from one set of issues to another. In this chapter, a 
transposition was effected from conflicting principles of choice to 
conflicting concepts of rationality. Similarly, Davis's criticism and 
Rapoport's reply effect a transposition from a conflict between 
dominance and utility to a conflict between temptation and integrity. 

It is also clear, however, that the conflict is not resolved 
merely by virtue of being transposed. 


5 Rapoport & Chammah, 1965, p.25.
Chapter Five


A Resolution Via Newcomb's Paradox 


Nozick's publication of Newcomb's Paradox, and his treatment of
it, engenders ongoing debate in game-theoretic, decision-theoretic,
and philosophical literature.¹ In some respects, Newcomb's Paradox
and the Prisoner's Dilemma pose similar problems; in other respects,
quite different ones. Consideration of the similarities led both 
Brams and Lewis to revelations of relevance to this enquiry; namely, 
that the static Prisoner's Dilemma can be viewed as constituting a 
particular case of Newcomb's Paradox. This view is relevant because 
Brams also gives an attempted resolution of the paradox. One can 
enquire whether the resolution seems sound and, mutatis mutandis, 
whether it perforce applies to the particular case of the dilemma as 
well. 

To begin with, then, let Newcomb's demon be introduced: 


"Suppose a being in whose power to predict your choices 
you have enormous confidence. . .You know that this being 
has often correctly predicted your choices in the past 
(and has never, so far as you know, made an incorrect 
prediction about your choices), and furthermore you know 
that this being has often correctly predicted the choices 
of other people, many of whom are similar to you, in the 
particular situation to be described below. One might 
tell a longer story, but all this leads you to believe 
that almost certainly this being's prediction about your
choice in the situation to be discussed will be cor-
rect."³


The player then finds himself in this situation. Two boxes, B1
and B2, are placed in front of him. B1 is transparent; B2, opaque. B1
contains £1,000. B2 contains either £1,000,000 or nothing, depending
upon what Newcomb's demon predicts about the player's upcoming 
choice. The player must choose between taking either the contents of 


1 Nozick, 1969, pp.114-46.


2 S. Brams, 'Newcomb's Problem and Prisoner's Dilemma', Journal
of Conflict Resolution, 19, 1975, pp.596-612; and D. Lewis, 'Priso-
ner's Dilemma is a Newcomb Problem', Philosophy and Public Affairs,
8, 1979, pp.235-240.


3 Nozick, 1969, p.114.
both boxes, or the contents of B2 only. If the being predicts that
the player will choose the contents of both boxes, it does not place
£1,000,000 in B2. If the being predicts that the player will choose
the contents of B2 only, it places £1,000,000 in B2.

The play unfolds in a strict sequence. First, the being makes 
its prediction. Second, according to its prediction, it places either 
nothing or £1,000,000 in B2. Third, the player makes his choice. The
game matrix is as follows. 


Game 5.1 — Newcomb's Paradox 


                                   being
                        predicts          predicts
                        B2 only           B1 & B2

          chooses         £M                £0
          B2 only
player
          chooses       £M + £T             £T
          B1 & B2


where £M = £1,000,000 and £T = £1,000 


As in the Prisoner's Dilemma, one encounters a conflict between 
two principles of choice: dominance, and maximization of expected 
utility. 

Choosing both boxes strongly dominates choosing B2 only, since
£M + £T > £M and £T > £0. [This remains true despite the arbitrari-
ness of the utility of money, as long as said utility is taken to be
any strictly increasing function of the amount; i.e. if X > Y, then
U(£X) > U(£Y).] The dominance principle dictates that, no matter what
the being predicts, the player fares better by choosing both boxes.

Then again, the player's expected utilities of choosing B2
only, and of choosing both B1 and B2 are, respectively,


EU(B2) = p(B2)U(£M) + (1-p)(B1 & B2)U(£0)
EU(B1 & B2) = (1-p)(B2)U(£M + £T) + p(B1 & B2)U(£T)

where p(B2) is the probability that, if the player chooses B2, the
being has correctly predicted this choice; and so forth.

If one takes the utility of money to be proportional to the
base ten logarithm of the amount, and if one assumes the utility of
£0 to be nothing, then (rounding the utility of £M + £T, which is
approximately 6.0004, to 6) one has


EU(B2) = 6z
EU(B1 & B2) = 6(1-z) + 3z


where z is the probability that the being has correctly predicted the
player's choice. According to this utility assignment, EU(B2) >
EU(B1 & B2) if z > 2/3. Thus, the player should choose only box two
if the being's predictive success rate exceeds two-thirds.

It should be noted that the selection of a monetary utility 
function has a pronounced effect upon the overall expected utilities. 
If, for example, one now takes the utility of money to be proportion- 
al to its actual amount, then one has 


EU(B2) = 10^6 z
EU(B1 & B2) = (10^6 + 10^3)(1-z) + 10^3 z


where z is once again the probability that the being has correctly
predicted the player's choice. In this case, EU(B2) > EU(B1 & B2) if
z > 1001/2000. Thus, the player should choose only box two if the
being's predictive success rate exceeds one-half (by more than one 
two-thousandth). This substantial relaxation of the probabilistic 
demand results from the selection of a linear utility function. 
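
The two probabilistic demands can be checked numerically. The following Python sketch is illustrative only: the function names are assumptions of the example, not part of the original analysis. It solves EU(B2) = EU(B1 & B2) for z under each utility function. Because it uses the exact logarithm of £1,001,000 rather than the rounded value of 6, the logarithmic threshold falls marginally above two-thirds.

```python
import math

M = 1_000_000  # £M, the possible contents of box two
T = 1_000      # £T, the contents of box one


def log_utility(amount):
    """Base-ten logarithmic utility, with U(£0) taken to be 0 as in the text."""
    return math.log10(amount) if amount > 0 else 0.0


def linear_utility(amount):
    """Utility proportional to the amount itself."""
    return float(amount)


def expected_utilities(z, utility):
    """Return (EU(B2), EU(B1 & B2)) when the being is correct with probability z."""
    eu_b2 = z * utility(M) + (1 - z) * utility(0)
    eu_both = (1 - z) * utility(M + T) + z * utility(T)
    return eu_b2, eu_both


def threshold(utility):
    """Value of z at which the two expected utilities are equal."""
    numerator = utility(M + T) - utility(0)
    denominator = utility(M) - utility(0) + utility(M + T) - utility(T)
    return numerator / denominator


if __name__ == "__main__":
    for name, u in (("logarithmic", log_utility), ("linear", linear_utility)):
        print(name, round(threshold(u), 6))
```

Above the threshold the player who maximizes expected utility takes box two only; below it, both boxes. The closed form in `threshold` follows from setting the two expressions in `expected_utilities` equal and solving for z.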

Notwithstanding the range of probabilistic demands made pos- 
sible by the arbitrariness of the utility of money, one can safely 
infer from Nozick's description that the being's predictive success 
rate is such that the expected utility of choosing only box two is 
much greater than that of choosing both boxes. Hence the principle of 
maximizing expected utility suggests that the player choose the 
contents of box two only. 

It is also clear from Nozick's description that a player is 
able to make use of an a posteriori probability calculus, if he so
wishes. Newcomb's demon apparently has unlimited funds to disburse if
need be, and a player may avail himself of a long series of observa- 
tions in order to ascertain the limiting frequency with which the 
being makes correct predictions. The player who elects to maximize 
his expected utility in this model has therefore a more objectively 
reliable method of assigning probabilities than in the static Priso- 
ner's Dilemma. 

But this advantage is offset, if not reduced to
irrelevancy, by another circumstance peculiar to Newcomb's paradox.
If one re-considers the strict temporal order of the moves (first, 
the being's prediction; second, the being's consequent placement or 
non-placement of £M in box two; third, the player's subsequent 
choice), one detects an implicit flaw in the argument for maximizing 
expected utility. 

Suppose that the first two moves have been made; i.e. that the 
being has made its prediction, and has acted upon it. Now the player 
must choose either the contents of box two alone, or the contents of 
both boxes. It is most certainly the case that box two presently 
contains either £M, or nothing. The contents of box two cannot now be 
affected by the player's choice, and the player obtains the contents 
of box two regardless of his choice. If the player chooses both
boxes, he is then guaranteed to obtain no less than £T; whereas if
he chooses box two only, and if the being has predicted incorrectly, 
then the player obtains nothing. 

This is not simply a restatement of the dominance principle,
for the following reason. An a posteriori probabilist may well object
to the foregoing argument, on the ground that the being's observed 
frequency of predictions is, let one suppose, 99.9999% correct. If 
the player now chooses both boxes, the being will almost certainly 
have predicted his choice, and will have placed nothing in box two. 
But if the player now chooses box two only, the being, by the same 
token, will have predicted this choice with the same high degree of 
accuracy, and will have placed £M in box two. The player should 
therefore choose box two only. 

The a posteriori probabilist's objection is countered by the 
assertion that, while the being's prediction and the player's choice
are evidently probabilistically dependent (even to the extent that
the partial dependence approaches complete dependence), there is 
absolutely no causal dependence between the two. The being's predic- 
tion has no causal influence over the player's choice; in conse- 
quence, the being's prediction can be incorrect. And neither can the 
player's choice have any causal influence over the being's predic-
tion; for that would entail a violation of the temporal succession of
events. In other words, if the being has predicted incorrectly, then

(i) if box two now contains nothing, then the player's choice
of box two only cannot cause the being to place £M therein; and

(ii) if box two now contains £M, then the player's choice of
both boxes cannot cause the being to remove the £M therefrom.
Bolstered by the assertion of causal independence, a player may be
tempted to choose the contents of both boxes.

So Newcomb's paradox embodies a conflict not only between the
principles of dominance and maximization of expected utility, but 
also between corollary arguments of complete causal independence and 
near-complete probabilistic dependence, respectively. 

Nozick put Newcomb's problem to a great many people, and 
elicited their choices as hypothetical players. He found: 


"To almost everyone it is perfectly clear and obvious 
what should be done. The difficulty is that these people 
seem to divide almost evenly on the problem, with large
numbers thinking that the opposing half is just being
silly."⁴


Given such a response, Newcomb's problem may justly bear the mantle 
of a paradox. While opinion may divide as evenly in the Prisoner's 
Dilemma, either half can appreciate why the other chooses as it does, 
without necessarily accusing it of irrationality (or silliness). 
Conflict of choice in the Prisoner's Dilemma can be understood as a 
conflict between individual and collective rationality, and alterna-
tively as uncertainty in a collectively rational prisoner's assess-
ment of the other prisoner's rationality. Perhaps Newcomb's problem
is paradoxical because, among other reasons, it brings the zero-sum 
concept of rationality into a non-zero-sum game (in which the being, 


4 Nozick, 1969, p.117.
having an infinite supply of funds, gains or loses nothing) without
having to re-define rational play. 

In consequence, every player can exercise individual rationa- 
lity with impunity, and seek the maximum possible gain. There are no 
collective outcomes to be weighed; a player can neither exploit the 
being, nor be exploited by it. A player has nothing to lose, and 
stands to gain substantially. In these respects, Newcomb's problem 
differs patently from the Prisoner's Dilemma. If a multitude of
players can be thus described, it seems paradoxical indeed that their 
choices should manifest the same divergence, according to the same 
principles, as in the Prisoner's Dilemma. 

Given these critical differences between the two models, one 
seeks an explanation for the similarities between the dilemmas that
the players face. As intimated earlier, the Prisoner's Dilemma can be
regarded as a particular case of Newcomb's paradox. Explicitly, both
Brams and Lewis have argued that the Prisoner's Dilemma can be viewed
as two interacting Newcomb's paradoxes.⁵

To appreciate this perspective, one must first grant that the 
player in Newcomb's paradox is playing a game, in effect, against a 
state of nature.⁶ This follows from the strict sequence of moves,
which begins with the being's prediction, and continues with its 
placement, or non-placement, of £M in box two. When the player makes 
his choice, the possible outcomes are already halved, from four to 
two, by the being's previous moves. From the being's point of view, 
the player is facing a state of nature which has only one possible 
pair of outcomes, because the being has already predicted one of the 
player's two possible choices. Before the player actually makes his 
choice, the being then perceives the game in one of two ways, depend-
ing upon what it has predicted:


5 Brams, 1975; Lewis, 1979.


6 This point, and its decision-theoretic ramifications, are
spelled out in some detail by Nozick, 1969.
Game 5.2 — Being's View

                     being predicted
                        B2 only

     will choose          £M
     B2 only
A
     will choose        £M + £T
     B1 & B2


Game 5.3 — Being's Alternative View

                     being predicted
                        B1 & B2

     will choose          £0
     B2 only
A
     will choose          £T
     B1 & B2


The player (A) also realizes this, but the nature of the game 
prohibits him from knowing which pair of outcomes actually remains, 
until after he will have made his choice. As such, Newcomb's paradox 
is played against a state of nature. 

In the Prisoner's Dilemma, the very absence of a strict se- 
quence of moves provides a convenient justification for viewing the 
game as a dual Newcomb's paradox. Neither prisoner is concerned with 
the order in which the individual choices are made, because all joint 
outcomes are insensitive to temporal permutations of choice. Whether 
prisoner A makes his choice first, and B second, or vice-versa, has 
no bearing on the outcome. This circumstance arises because the 
choices in both models under consideration are causally independent. 

For the sake of contrast, one may briefly consider a causally 
dependent game against a state of nature. Suppose two collectors wish 
to acquire a given work of art at an auction. Suppose also that 
collector A can afford a maximum bid of £1,000; collector B, £5,000.
Each is willing to bid his respective maximum. With respect to the
work of art, both collectors are playing a game against a state of 
nature. Though perhaps unknown to both collectors, the outcome is 
determined before the bidding commences: A will neither part with 
money nor acquire the work, and B will both part with money and 
acquire the work. But for B, the cost of the outcome is not predeter- 
mined; that depends upon who bids first. 

Should A enter an opening bid of, say, £400, B might reply with 
£500. If bidding continues in this fashion, A will soon bid his
maximum of £1,000, and B will acquire the work for, say, £1,100. On
the other hand, should B enter an opening bid of, say, £2,000, he
would unknowingly pre-empt A, and acquire the work at a higher cost.
In general, when the players' choices exert mutual causal influence,
the temporal sequence of play must be taken into account.⁷

Returning to the Prisoner's Dilemma, one can imagine that, from 
prisoner A's point of view, prisoner B may be assumed to have con- 
veyed his choice to the authorities. Then A's situation is similar to 
that of the player in Newcomb's paradox: from the authorities’ point 
of view, A is now playing against a state of nature, in which only 
one pair of outcomes is attainable. Although A realizes this, he is
prohibited from knowing which of the original two pairs remains,
until he will have made his choice. The situation is completely
symmetric with respect to prisoner B, who may assume, without con-
tradiction, that prisoner A has already conveyed his choice to the
authorities, and so forth. 

Obviously, from everyone's point of view, it is logically 
impossible for both prisoners to convey their choices first; they 
must do so either one after another, or with approximate simultane-
ity. But it is logically possible for each prisoner to assume that
the other has done so first, in which case they find themselves in a
dual Newcomb's paradox. In that original paradox, however, a player 
can fare no worse than no gain; whereas in this dual paradox, either 
prisoner can incur a loss of liberty. 

It seems advantageous to regard the Prisoner's Dilemma as a
particular case of Newcomb's paradox, if only because this view
affords one explanation as to why the models exhibit both radical 
differences (in criteria of rationality) and pronounced similarities 
(in divergence of rational choice). 

But this view offers a second advantage, of arguably greater 
moment than the first. Newcomb's paradox has been reformulated in a
way that eliminates one of the two principles of choice from con-


7 E.g. see also M. Bar-Hillel & A. Margalit, 'Newcomb's Paradox
Revisited', The British Journal for the Philosophy of Science, 23,
1972, pp.295-305.
sideration. If the reformulation of the paradox is sound, then it
ought to be applicable to the dilemma as well.

Brams credits Ferejohn with the insight of reformulating
Newcomb's paradox.⁸ Essentially, the being's desiderata are trans-
posed from the game-theoretic, to the decision-theoretic variety. The
transposition is effected in the following way: instead of defining
the game-states in terms of what the being predicts about the play-
er's choice (i.e. that the player will choose either the contents of
box two only, or the contents of both boxes), one defines them in
terms of the eventual astuteness of the prediction itself (i.e. that
the being's prediction proves either correct, or incorrect).


Game 5.4 — Newcomb's Paradox Reformulated 


                                        being
                               predicted       predicted
                               correctly       incorrectly

          will have chosen        £M              £0
          B2 only
player
          will have chosen        £T            £M + £T
          B1 & B2


The outcomes themselves remain unaltered. But the choice of 
both boxes no longer dominates the choice of box two only. The 
principle of dominance is thus eliminated from contention, while 
maximization of expected utility yields the same results as in Game 
5.1. 
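
The disappearance of dominance can be verified mechanically. The Python sketch below is a minimal illustration, with names chosen for the example rather than drawn from Brams or Ferejohn. It encodes Games 5.1 and 5.4 as payoff rows and tests whether one row pays strictly more than the other in every column. Amounts of money stand in for utilities, which is harmless provided utility is strictly increasing in the amount.

```python
M = 1_000_000  # £M
T = 1_000      # £T

# Rows are the player's choices; columns are the being's states.
# Game 5.1 columns: (predicts B2 only, predicts B1 & B2)
game_5_1 = {"B2 only": (M, 0), "B1 & B2": (M + T, T)}
# Game 5.4 columns: (predicted correctly, predicted incorrectly)
game_5_4 = {"B2 only": (M, 0), "B1 & B2": (T, M + T)}


def strictly_dominates(matrix, row, other):
    """True if `row` pays strictly more than `other` in every column."""
    return all(a > b for a, b in zip(matrix[row], matrix[other]))


def dominant_row(matrix):
    """Return the strictly dominant row, or None if there is none."""
    for row in matrix:
        others = [r for r in matrix if r != row]
        if all(strictly_dominates(matrix, row, other) for other in others):
            return row
    return None


if __name__ == "__main__":
    print("Game 5.1:", dominant_row(game_5_1))
    print("Game 5.4:", dominant_row(game_5_4))
```

Both matrices contain the same four outcomes; only their arrangement by column differs, and that rearrangement alone is what removes the dominant row.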

But note the changes in tense. In Game 5.1, both the being's 
prediction and the player's choice are couched in the present tense 
(with the understanding that the prediction precedes the choice). In
Game 5.4, the astuteness of the being's prediction can be assessed 
only after the player will have made his choice. Thus, in order to 
construct the reformulated game matrix, which presumably aids him in 
making a choice, the player must imagine that he has already made a 


8 Brams, 1975, pp.599-600.
choice. Though he cannot attain an outcome before making a choice, he
knows full well what the possible outcomes are, so is free to imagine 
the consequences of future choices, as if they were already made, and 
to juxtapose them with the subsequent correctness or incorrectness of 
the being's (previous) prediction. 

It does not seem unsound to alter the temporal perspective from 
which the player views the game, provided that one does not tamper 
with the strict temporal sequence of the moves. (This point will 
shortly become relevant in another context also.) 

One hastens to add that, although in this reformulation the 
principle of maximizing expected utility prevails over dominance by
default, the player is not obliged to adopt either principle. Whether
playing Game 5.1 or 5.4, the player may arrive at his choice by
flipping a coin, consulting an oracle, or by any other means he deems
fit. But a player wishing to employ an analytical method of choice 
may be caught by the paradox in Game 5.1, and be convinced by the 
transposition in Game 5.4. 

Brams is not slow to apply this reformulation to the Prisoner's
Dilemma.⁹ He leaps directly from the reformulated Newcomb's paradox
to the reformulated dilemma, and takes as a premise that both prison-
ers will realize that the dominance principle now gives way to the
maximization of expected utility. However, as Brams realizes, this is
still a far cry from the elicitation of mutual co-operation. Brams is
able to show that each prisoner co-operates only if the probability
of the correctness of his prediction of the other player's choice is
sufficiently high. This, in turn, invites comments from Rapoport on
the manifest difficulties involved in evaluating Bayesian probabili-
ties.¹⁰

It is not necessary to recapitulate Brams's argument in order 
to appreciate that the reformulation of the Prisoner's Dilemma, 
compelling though it is, does not resolve the conflict. An effective
(though by no means unique) way to represent the reformulated dilemma


9 Ibid., pp.603ff.


10 A. Rapoport, 'Comment on Brams's Discussion of Newcomb's
Paradox', Journal of Conflict Resolution, 19, 1975b, pp.613-19.
was developed, in another guise, in the preceding chapter. Suppose
that both prisoner A and prisoner B are collectively rational. Then,
in order to reformulate the game matrix, either prisoner need only
ask "Can the other prisoner correctly predict my choice?".

Recall the criteria of collective rationality. Prisoner A is
said to be collectively rational if

(i) he elects to maximize his expected utility, and

(ii) he adopts the rule of either complete or partial probabi-
listic dependence, assigning to p(c/C) and p(d/D) the probability
that prisoner B is collectively rational.

Then, for example from A's point of view, the following matrix 
obtains: 


Game 5.5 — Prisoner's Dilemma Reformulated 
                      A's belief about B

           B will correctly        B will incorrectly
           predict A's choice      predict A's choice

     C          R, R                    S, T
A
     D          P, P                    T, S


Defection no longer dominates. Since B is collectively ration-
al, A knows that if B predicts A will co-operate, then B will co-
operate; and if B predicts A will defect, then B will defect. If A
assigns sufficiently high probability to B's ability to predict
correctly, then A co-operates; otherwise, he defects. The situation
is symmetric for B. But the reformulation of the matrix leads to the
same problem posed by the transposition in the preceding chapter;
namely, that no analytical method exists by which either prisoner can 
assign said probabilities in the static mode. 
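
What "sufficiently high" amounts to in Game 5.5 can be made explicit. In the Python sketch below, q denotes the probability A assigns to B's predicting correctly; the symbol q, the function names, and the sample payoffs T = 5, R = 3, P = 1, S = 0 are assumptions of the example and do not come from the text. Setting qR + (1-q)S equal to qP + (1-q)T and solving gives the threshold q* = (T-S)/((T-S)+(R-P)).

```python
def expected_utilities(q, T, R, P, S):
    """A's expected utilities (co-operate, defect) in Game 5.5, where q is the
    probability A assigns to B correctly predicting A's choice."""
    eu_cooperate = q * R + (1 - q) * S
    eu_defect = q * P + (1 - q) * T
    return eu_cooperate, eu_defect


def cooperation_threshold(T, R, P, S):
    """Value of q above which co-operation has the higher expected utility."""
    return (T - S) / ((T - S) + (R - P))


def choice(q, T, R, P, S):
    """'C' if co-operation strictly maximizes expected utility, otherwise 'D'."""
    eu_c, eu_d = expected_utilities(q, T, R, P, S)
    return "C" if eu_c > eu_d else "D"


if __name__ == "__main__":
    payoffs = dict(T=5, R=3, P=1, S=0)  # illustrative values only
    print("threshold:", cooperation_threshold(**payoffs))
    for q in (0.5, 0.7, 0.8):
        print(q, choice(q, **payoffs))
```

Whenever T > R > P > S, the quantity T-S exceeds R-P, so the threshold always lies above one-half. The sketch says nothing about how A is to arrive at q, which is precisely the difficulty in the static mode.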

Thus far, one has not encountered any ambiguity about the 
actual mechanics of the calculus of maximizing expected utility; one 
has simply failed to find an objective method for assigning the 
probability distribution which forms a necessary component of that
calculus. Rapoport's concept of collective rationality, and Brams's
attempted resolution of the Prisoner's Dilemma via the reformulation
of Newcomb's paradox, both prescribe the maximization of expected
utility. That an a priori probabilist cannot readily implement the
calculus in the static mode does not negate the soundness of the
prescription itself.

But Nozick's treatment of the maximization of expected utility, 
which is consistent with Rapoport's, Brams's, this enquiry's, and
many others', is subjected to a vigorous attack by Levi.¹¹ Since the
principle has theoretical value in the static Prisoner's Dilemma (and 
will be of demonstrable empirical value in the iterated Prisoner's 
Dilemma), Levi's objection ought to be examined. If Nozick's applica- 
tion of the principle to Newcomb's paradox is flawed, then the flaw 
should be exposed, lest it be incorporated into applications of the
identical principle to the Prisoner's Dilemma. 

Levi presents three different cases, in each of which Newcomb's 
demon has made at least 1,000,100 total predictions of the player's 
choice. (It is irrelevant to the problem whether 1,000,100 players
have each played one game, or one player has played 1,000,100 games,
and so forth.) The three different distributions of predictions and
choices are as follows:¹²


Game 5.6 — Newcomb's Paradox, Levi's Case 1


                                      being
                          predicts B2      predicts B1 & B2

          chooses B2        900,000              10
player
          chooses B1 & B2   100,000              90


11 I. Levi, 'Newcomb's Many Problems', Theory and Decision, 6,
1975, pp.161-175.


12 Ibid., p.165.
If the player wishes to maximize his expected utility, he asks
two questions: "What is the frequency with which the being correctly 
predicted the player's choice of only box two?", and "What is the 
frequency with which the being correctly predicted the player's
choice of both boxes?". The answer to the first question is: 
900,000/1,000,000 or 9/10; the answer to the second is: 90/100 or 
9/10. In this case, the a posteriori probability that the being 
predicts correctly is 9/10. 

Now the player can calculate his expected utilities. The 
expected utility of choosing only box two is 


EU(B2) = p(C2/P2)U(£M) + (1-p)(C2/P1&2)U(£0)


where p(C2/P2) means "the probability that the player chooses only
box two, if the being predicted his choice of only box two"; and
(1-p)(C2/P1&2) means "the probability that the player chooses only
box two, if the being predicted his choice of both boxes."

For simplicity, assume the utility of money to be proportional 
to the amount. Then 


EU(B2) = (9/10) (EM) = £900,000 


Similarly, the player calculates his expected utility of 
choosing both boxes: 


EU(B1 & B2) = p(C1&2/P2)U(£M+£T) + (1-p)(C1&2/P1&2)U(£T)


where p(C1&2/P2) means "the probability that the player chooses both
boxes, if the being predicted his choice of only box two"; and
(1-p)(C1&2/P1&2) means "the probability that the player chooses both
boxes, if the being predicted his choice of both boxes." Then


EU(B1 & B2) = (1/10)(£M+£T) + (9/10)(£T) = £101,000
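
The Case 1 arithmetic can be reproduced directly from the frequency table. The Python sketch below is illustrative only; the dictionary layout and function names are assumptions of the example. It derives the probability of each choice given each prediction from the Case 1 counts and, taking utility to be proportional to the amount as above, recovers the two expected utilities exactly.

```python
from fractions import Fraction

M = 1_000_000  # £M
T = 1_000      # £T

# Case 1 frequencies keyed by (player's choice, being's prediction).
# "B2" stands for box two only, "both" for B1 & B2.
case_1 = {
    ("B2", "B2"): 900_000,
    ("B2", "both"): 10,
    ("both", "B2"): 100_000,
    ("both", "both"): 90,
}


def p_choice_given_prediction(counts, choice, prediction):
    """Relative frequency of `choice` among games in which the being
    made the given `prediction`."""
    total = sum(n for (_, p), n in counts.items() if p == prediction)
    return Fraction(counts[(choice, prediction)], total)


def expected_utilities(counts):
    """Return (EU(B2), EU(B1 & B2)) with utility proportional to the amount."""
    eu_b2 = (p_choice_given_prediction(counts, "B2", "B2") * M
             + p_choice_given_prediction(counts, "B2", "both") * 0)
    eu_both = (p_choice_given_prediction(counts, "both", "B2") * (M + T)
               + p_choice_given_prediction(counts, "both", "both") * T)
    return eu_b2, eu_both


if __name__ == "__main__":
    eu_b2, eu_both = expected_utilities(case_1)
    print("EU(B2) =", eu_b2, " EU(B1 & B2) =", eu_both)
```

Exact fractions are used so that the comparison of £900,000 with £101,000 is free of rounding.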


Hence, maximizing expected utility prescribes that the player
choose box two only.

Levi's other cases are as follows:


Game 5.7 — Newcomb's Paradox, Levi's Case 2 


                                      being
                          predicts B2      predicts B1 & B2

          chooses B2        495,045            55,005
player
          chooses B1 & B2    55,005           495,045


Game 5.8 — Newcomb's Paradox, Levi's Case 3 


                                      being
                          predicts B2      predicts B1 & B2

          chooses B2             90           100,000
player
          chooses B1 & B2        10           900,000


In Games 5.7 and 5.8, the answer to the questions "What is the 
frequency with which the being correctly predicted the player's 
choice of only box two?", and "What is the frequency with which the 
being correctly predicted the player's choice of both boxes?", is 
identical to the answer in Game 5.6; namely, 9/10. It follows that 
the maximization of expected utility prescribes a choice of only box 
two, for Games 5.7 and 5.8 as well. All seems as it should be. 

Levi's objection is based on a consideration of converse
conditional probabilities, p(P2/C2) and p(P1&2/C1&2): in other
words, "the probability that the being predicts a choice of only box
two, if the player chose only box two", and "the probability that the
being predicts a choice of both boxes, if the player chose both
boxes", respectively.

Levi tabulates the conditional probabilities, and their converses,
as follows:


Table 5.1 — Newcomb's Paradox: Conditional Probabilities 
& Their Converses 


           p(C2/P2)   p(C1&2/P1&2)   p(P2/C2)    p(P1&2/C1&2)
Case 1       0.9          0.9        0.9999888    0.0008991
Case 2       0.9          0.9        0.9          0.9
Case 3       0.9          0.9        0.0008991    0.9999888

And, cautions Levi,

"Care should be taken [not to confuse] the
conditional probabilities p(C2/P2) and p(C1&2/P1&2) which
are conditional [probabilities of X's]
choosing an option given that the demon predicts it with
the conditional probabilities p(P2/C2) and p(P1&2/C1&2)
which are the conditional probabilities of the demon's
predicting X's choosing that option given that X chooses
that option. Assuming that the first two conditional
probabilities are high, it does not follow that the
second two conditional probabilities are both high."¹³


The point is well taken. But Nozick stands accused of having
made just this "fallacious inference" in his argument supporting the
maximization of expected utility in Newcomb's paradox.¹⁴ (Nozick
eventually favours dominance, but first presents detailed arguments
both for and against both principles of choice.) Based on his accusa-
tion that Nozick has unjustifiably assumed the converse conditional
probabilities to be high, Levi concludes


". . .not only that Nozick's first argument is invalid
but that, from the point of view of someone committed to
using the principle of maximizing expected utility, no
recommendation can be made concerning what X should do
without filling in more details concerning X's predica-
ment than Nozick has done. . .


Suppose X faces Newcomb's problem and does not himself 
know whether he is facing a variant of case 1, case 2, or 


13 Levi, 1975, p.165. To minimize confusion, Levi's notation has 
been replaced, throughout, by the notation adopted in this enquiry. 
14 Ibid., p.166.


15 Ibid.
case 3. . .X cannot apply the principle of maximizing
expected utilities." 

This is so because, in Levi's view, the maximization of ex- 
pected utility, against a state of nature, is calculated by using the 
converse conditional probabilities. Thus, for instance, in Levi's 
opinion, maximizing expected utility in cases 1 and 3 prescribes a 
choice of both boxes, not box two only. 

However, it would seem that Levi himself has made a fallacious 
inference, which in turn invalidates his argument against Nozick. In 
the context of Newcomb's Paradox, the converse conditional probabili- 
ties are inadmissible, because they permute the temporal order in 
which the moves are made. Compare the tensed meanings of the 
conditional and converse conditional probabilities, p(C_2/P_2) and 
p(P_2/C_2): "the probability that the player chooses only box two, if 
the demon predicted his choice of only box two", and "the probability 
that the demon predicts his choice of only box two, if the player 
chose only box two", respectively. If strict temporal succession of 
moves is to be preserved, then the converse conditional probability 
cannot be admitted, since it entails the being’s prediction of the 
player's choice after the player has made the choice. 

This does not refute Levi's general point, that one cannot 
reflexively equate the values of conditional and converse conditional 
probabilities. However, the set of games to which Levi's point 
applies does not and cannot include Newcomb's paradox, at least as 
Nozick presents it. And as Nozick's use of conditional probabilities 
in his calculation of expected utilities appears to be correct, his 
argument in favour of maximizing expected utilities seems quite 
sound. 

But if one relaxes the temporal constraint, and allows the 
being to make its prediction after the player makes his choice, one 
arrives at a situation resembling the Prisoner's Dilemma, which is 
insensitive to permutations of temporal order of choice. In this set 
of games, one must indeed exercise care to adopt the appropriate 
probability distribution, depending on which player's utilities one 
seeks to maximize. 


16 Ibid., p.166 & pp.173-4. 

Before concluding this chapter, it must be fairly said that 
Newcomb's paradox admits of many other avenues of investigation, 
which may be broader, longer, and also more winding than that ex- 
plored herein. For this enquiry's purpose, it is sufficient to note 
salient respects in which the Prisoner's Dilemma both can, and cannot 
be considered as a kind of Newcomb's paradox. 

It seems interesting and relevant that the reformulation of the 
paradox points to a similar theoretical resolution of the dilemma as 
does the implementation of collective rationality; namely, adoption 
of the principle of maximizing expected utility. Unfortunately, when 
the dilemma occurs on a single and isolated occasion, neither 
prisoner has recourse to a posteriori probability distributions, and 
both prisoners must rely on a priori probabilistic criteria. As such, 
there seems no reliable way for a given prisoner to assess either 

(i) the probability that the other prisoner is collectively 
rational, or 

(ii) the probability that the other prisoner can correctly 
predict the given prisoner's choice. 

The appeals to collective rationality of choice (in the previous 
chapter) and to mutual predictability of choice (in this chapter) 
hold more in common than their respective theoretical but 
not-quite-realizable resolutions of the Prisoner's Dilemma; both make 
implicit use of meta-game theory in their respective processes. The 
next chapter examines an explicit meta-game-theoretic resolution of 
the static Prisoner's Dilemma. 


17 See also Brams, 1975, p.608. 
Chapter Six 


A Meta-Game Resolution 


A meta-game is defined by Howard in the following way: 


"If G is a game in normal form, and if k is a player in 
G, the (first-level) metagame kG . . . is the normal-form 
game that would exist if player k chose his strategy in G 
in knowledge of the other players’ strategies (in G)." 


It is understood that the phrase "in knowledge of" is congruent with 
the epistemological concept of knowledge as "justified true belief". 
Hence, in the absence of factual knowledge, if player k assigns 
probability distributions to the other players' strategies, or 
otherwise hypothesizes about them, then player k is considered to be 
playing a meta-game. 

This consideration endorses the claim that the implementation 
of collective rationality and the reformulation of the Prisoner's 
Dilemma matrix both entail (first-level) meta-games. In order to 
apply the working definition of collective rationality adopted 
herein, prisoner A must assign a probability value to the proposition 
that prisoner B is collectively rational. This involves A's delibera- 
tion of prisoner B's principle(s) of choice. Similarly, if prisoner A 
reformulates the game matrix, he must determine the probability with 
which prisoner B will correctly predict A's choice. This also invol- 
ves A's deliberation of B's principle(s) of choice. In this sense, A 
is playing a meta-game in both cases. 

Howard defines a meta-game, not in terms of principles of 
choice, but in terms of strategies. Note the difference: a principle 
can be thought of as a decision rule which generates a single choice; 
a strategy, as a decision rule which generates a sequence of choices. 
In both meta-games and iterated games, the former gives way to the 
latter. At the same time, the meta-game still takes place in the 
static mode. Meta-game strategies are said to expand the matrix, not 
to iterate it. Nevertheless, meta-game decision rules do generate 
sequences of choices, and are therefore justly termed strategies. 


1 N. Howard, '"General Metagames": An Extension of the Metagame 
Concept', in Rapoport (ed.), 1974, pp.260-83. See also idem, 1971. 

2 E.g. see A. Rapoport, 'Escape From Paradox', Scientific 
American, July 1967, pp.50-56. 

At this stage, it is appropriate to mention that in any game 
(or meta-game), the number of possible strategies tends to 
preponderate greatly over the number of possible moves. Even in a 
relatively simple game, the preponderance can be numerically 
staggering. To illustrate the point, one can cite Rapoport's example 
of tic-tac-toe. 


He computes the number of possible ways to make the first five 
moves as 9x8x7x6x5, or 15,120. But player A, who moves first, has 9 
possible strategies on move one, then 7^8 possible strategies on move 
three (by associating any of his seven replies to any of B's eight 
choices), then 5^6 possible strategies on move five (by associating 
any of his five replies to any of B's six choices), for a total of 
9 x 7^8 x 5^6, or 8.1 x 10^11 possible strategies. Player A has an 
average ratio of more than fifty-three million strategies per move. 
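
The arithmetic of this example can be checked directly. The following Python sketch simply multiplies out the quantities named above; the variable names are illustrative assumptions and are not drawn from Rapoport's text.

```python
# Ways to play the first five moves: each move fills one of the remaining cells.
move_sequences = 9 * 8 * 7 * 6 * 5

# Strategies available to the first player over his first three turns:
# one opening choice, then a planned reply to each possible choice by B.
strategies_move_one = 9
strategies_move_three = 7 ** 8   # any of 7 replies to each of B's 8 choices
strategies_move_five = 5 ** 6    # any of 5 replies to each of B's 6 choices
total_strategies = (strategies_move_one
                    * strategies_move_three
                    * strategies_move_five)

# Ratio of strategies to move sequences, as cited in the text.
ratio = total_strategies / move_sequences
print(move_sequences, total_strategies, round(ratio))
```

The exponents arise because a strategy must specify a reply to every contingency, whereas a sequence of moves records only the one contingency that actually occurs.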

As Rapoport mentions, considerations of symmetry would reduce 
these numbers by several orders of magnitude, but the ratio of 
strategies to moves would still remain exceedingly large. If one 
applies this calculus to the relatively complex game of chess, the 
mind rapidly boggles at the number of possible strategies available 
to the players. 

The point applies to the Prisoner's Dilemma as well. In the 
static meta-game, each player faces only two possible moves, but has 
recourse to a multiplicity of possible strategies. The idea in 
Howard's meta-game resolution is to eliminate the dominance principle 
by introducing "conditional" strategies, which generate an expanded 


3 Idem., 1960, pp.146-7. 

4 Ibid. 


5 White has 20 possible opening moves; so does Black. White has 
no fewer than 20 possible second moves; so does Black. The number of 
possible ways in which the players can each make two moves is 
20x20x20x20, or 160,000. But White has 20 possible strategies on his 
first move, and no fewer than 20^20 possible strategies on his second 
move, or 2 x 10^27 possible strategies for only two moves. Black has 
20 possible strategies on his first move, and no fewer than 20" 
possible strategies on his second move, or 1 x 10° possible strategies 
for only two moves. 

matrix containing new equilibrium outcomes. 

Suppose prisoner A has four conditional strategies: 

A_1: A co-operates regardless of B's expected choice. This 
strategy always generates choice C. 

A_2: A chooses what he expects B to choose. This strategy 
generates choice β, which can be either C or D. 

A_3: A chooses what he expects B not to choose. This strategy 
generates choice β*, which can be either D or C. 

A_4: A defects regardless of B's expected choice. This strategy 
always generates choice D. 


Let the payoffs to the prisoners be represented in utiles, as 
follows: 


Game 6.1 — Prisoner's Dilemma 


                 A 
            C         D 
      c    1,1       -2,2 
B 
      d    2,-2      -1,-1 


Now let prisoner A's conditional strategies be applied to Game 
6.1. The following meta-game matrix obtains: 


Game 6.2 — First-Level Meta-Game of Game 6.1 


                         A 
           A_1       A_2       A_3       A_4 
           (C)       (β)       (β*)      (D) 
      c    1,1       1,1       -2,2      -2,2 
B 
      d    2,-2      -1,-1     2,-2      -1,-1 


In game 6.2, prisoner A's choices are generated by his 
respective conditional strategies. The choices themselves are 
parenthesized, in order to emphasise their dependence on the 
respective strategies. 

In game 6.2, prisoner B no longer has a dominant choice. For B, 
defection dominates co-operation if A adopts A_1, A_3 or A_4, but 
co-operation dominates defection if A adopts A_2. If B were to 
maximize his expected utility, by assigning (via the principle of 
insufficient reason) an equiprobability of 1/4 to the likelihood that 
A adopts any strategy, then EU(c) = -1/2 and EU(d) = 1/2. With this 
probability 
distribution, maximizing expected utilities prescribes defection. But 
B has no assurance that A's choice among strategies will be made 
equiprobabilistically. How, then, is B to choose? 
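
The construction of game 6.2 from game 6.1, and B's expected utilities under equiprobability, can be sketched as follows. This is a minimal illustration in Python, not Howard's own formalism; the names GAME_6_1, A_STRATEGIES, first_level_metagame and expected_utility_for_b are assumptions introduced here. Payoffs are ordered as in the matrices above: payoff to B first, payoff to A second.

```python
from fractions import Fraction

# Game 6.1, keyed by (B's choice, A's choice); values are (payoff to B, payoff to A).
GAME_6_1 = {
    ("c", "C"): (1, 1), ("c", "D"): (-2, 2),
    ("d", "C"): (2, -2), ("d", "D"): (-1, -1),
}

# A's conditional strategies: each maps B's expected choice to A's choice.
A_STRATEGIES = {
    "A_1": lambda b: "C",                       # co-operate regardless
    "A_2": lambda b: "C" if b == "c" else "D",  # choose what B chooses
    "A_3": lambda b: "D" if b == "c" else "C",  # choose what B does not choose
    "A_4": lambda b: "D",                       # defect regardless
}


def first_level_metagame():
    """Build the meta-game matrix: rows are B's choices, columns A's strategies."""
    return {
        (b, name): GAME_6_1[(b, rule(b))]
        for b in ("c", "d")
        for name, rule in A_STRATEGIES.items()
    }


def expected_utility_for_b(matrix, b):
    """B's expected utility for choice b when A's strategies are equiprobable."""
    payoffs = [matrix[(b, name)][0] for name in A_STRATEGIES]
    return Fraction(sum(payoffs), len(payoffs))


game_6_2 = first_level_metagame()
eu_c = expected_utility_for_b(game_6_2, "c")
eu_d = expected_utility_for_b(game_6_2, "d")
print("EU(c) =", eu_c, " EU(d) =", eu_d)
```

Exact fractions are used so that the results can be compared directly with the values stated in the text.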

B might examine the matrix from A's point of view, and ask 
himself whether A has a dominant strategy. Then B would find that A 
stands to minimize his losses by choosing either A_2 or A_4, and to 
maximize his gains by choosing either A_3 or A_4. If A is individually 
rational, his best strategy is therefore A_4. But if A is collectively 
rational, and believes B to be collectively rational also, then A 
might risk choosing A_2. So B has narrowed the field of A's likely 
choices, to A_2 and A_4. In effect, B is now looking at the following 
reduced game: 


Game 6.3 — Sub-Matrix of Game 6.2 


                  A 
           A_2       A_4 
           (β)       (D) 
      c    1,1       -2,2 
B 
      d    -1,-1     -1,-1 


B now realizes that, for prisoner A, strategy A_4 weakly 
dominates A_2. Thus, if B has no extenuating reason to believe that A 
is collectively rational, B must defect in order to protect himself 
against A's impending defection. In sum, game 6.2 offers no 
resolution to the Prisoner's Dilemma. Weak dominance lures A, and thus 
impels B, to the mutually detrimental outcome of (-1,-1). 
But all is not yet lost. Howard defines the second-level 
meta-game as follows: 


". . .by recursion, the second-level metagame jkG, where 
j and k are players, is the game in which j chooses his 
strategy (in kG) in knowledge of the other's strategies 
(in kG); in terms of strategies in G, it is a game in 
which j reacts (a) to k's reactions to the actions of the 
players other than k; (b) to the actions of the players 
other than j and k." 


Only condition (a) is applicable to the two-person Prisoner's Dilem- 
ma. In terms of condition (a), then, let prisoner B react to prisoner 
A's strategies (which are A's reactions to B's possible actions) by 
taking first-level meta-game 6.2 to a second-level metagame. 

B associates two choices (either c or d) with each of A's four 
possible strategies, and so B generates 2^4, or sixteen, possible 
meta-strategies. The following meta-meta-game obtains: 


Game 6.4 — Second-Level Meta-Game of Game 6.1 


                              A 
               A_1       A_2       A_3       A_4 
     1234      (C)       (β)       (β*)      (D) 
     cccc      1,1       1,1       -2,2      -2,2 
     cccd      1,1       1,1       -2,2      -1,-1 
     ccdc      1,1       1,1       2,-2      -2,2 
     cdcc      1,1       -1,-1     -2,2      -2,2 
     dccc      2,-2      1,1       -2,2      -2,2 
     ccdd      1,1       1,1       2,-2      -1,-1 
     cdcd      1,1       -1,-1     -2,2      -1,-1 
B    dccd      2,-2      1,1       -2,2      -1,-1 
     cddc      1,1       -1,-1     2,-2      -2,2 
     dcdc      2,-2      1,1       2,-2      -2,2 
     ddcc      2,-2      -1,-1     -2,2      -2,2 
     dddc      2,-2      -1,-1     2,-2      -2,2 
     ddcd      2,-2      -1,-1     -2,2      -1,-1 
     dcdd      2,-2      1,1       2,-2      -1,-1 
     cddd      1,1       -1,-1     2,-2      -1,-1 
     dddd      2,-2      -1,-1     2,-2      -1,-1 


6 Howard, in Rapoport (ed.), 1974, p.261. 
In game 6.4, prisoner A has no dominant strategy. If A were to 
maximize his expected utility, by assigning (via the principle of 
insufficient reason) an equiprobability of 1/16 to the likelihood 
that B adopts any meta-strategy, then 


EU(A_1) = -1/2 
EU(A_2) = 0 
EU(A_3) = 0 
EU(A_4) = 1/2 


Given such a probability distribution, maximization of expected 
utilities prescribes unconditional defection. But A has no assurance 
that B will choose his meta-strategy equiprobabilistically. How, 
then, is A to choose? A might examine the matrix from B's point of 
view, and ask himself what B would choose. 
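
The sixteen meta-strategies of game 6.4, and A's expected utilities under equiprobability, can be generated mechanically. The sketch below is illustrative only; the names A_RULES, META, outcome and expected_utility_for_a are assumptions of this example. Each meta-strategy is written as a four-letter string whose k-th letter is B's reply to A's k-th strategy.

```python
from fractions import Fraction
from itertools import product

# Game 6.1, keyed by (B's choice, A's choice); values are (payoff to B, payoff to A).
GAME_6_1 = {
    ("c", "C"): (1, 1), ("c", "D"): (-2, 2),
    ("d", "C"): (2, -2), ("d", "D"): (-1, -1),
}

# A's four conditional strategies, in the order A_1 .. A_4.
A_RULES = [
    lambda b: "C",                       # A_1: co-operate regardless
    lambda b: "C" if b == "c" else "D",  # A_2: choose what B chooses
    lambda b: "D" if b == "c" else "C",  # A_3: choose what B does not choose
    lambda b: "D",                       # A_4: defect regardless
]

# B's meta-strategies: one reply (c or d) for each of A's four strategies.
META = ["".join(letters) for letters in product("cd", repeat=4)]


def outcome(meta, k):
    """Payoffs (to B, to A) when B plays `meta` and A plays strategy index k."""
    b = meta[k]
    return GAME_6_1[(b, A_RULES[k](b))]


def expected_utility_for_a(k):
    """A's expected utility for strategy k when B's meta-strategies are equiprobable."""
    return Fraction(sum(outcome(m, k)[1] for m in META), len(META))


eus = [expected_utility_for_a(k) for k in range(4)]
print(len(META), eus)
```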

If B examines column one of game 6.4, he notices that, should A 
choose strategy A_1, then eight of B's sixteen meta-strategies result 
in a payoff (to B) of two utiles; the other eight, of only one utile. 
B defines the set of meta-strategies, whose members yield two utiles, 
as S_1. Explicitly, 


S_1 = {dccc, dccd, dcdc, ddcc, dddc, ddcd, dcdd, dddd} 


Similarly, should A choose strategies A_2, A_3 or A_4, B defines 
S_2, S_3 and S_4 as those respective sets of meta-strategies whose 
members yield a better payoff to B (of one as opposed to minus one 
utiles, two as opposed to minus two utiles, and minus one as opposed 
to minus two utiles; in columns two, three, and four respectively). 
Explicitly, 


S_2 = {cccc, cccd, ccdc, dccc, ccdd, dccd, dcdc, dcdd} 
S_3 = {ccdc, ccdd, cddc, dcdc, dddc, dcdd, cddd, dddd} 
S_4 = {cccd, ccdd, cdcd, dccd, ddcd, dcdd, cddd, dddd} 


Now B finds the intersection of these sets: 


S_1 ∩ S_2 ∩ S_3 ∩ S_4 = {dcdd} 


In other words, the choice of meta-strategy dcdd guarantees 
that, no matter what strategy A chooses, B cannot fare better by 
choosing any other meta-strategy. B's next-best meta-strategy is 
ccdd, which yields identical results except when A chooses A_1, in 
which case ccdd yields only one utile to B (as opposed to two utiles 
yielded by dcdd). Although dcdd is neither strongly nor weakly 
dominant, it is the sole meta-strategy that yields the best possible 
result to B irrespective of what A chooses. In consequence, this 
meta-strategy exhibits a novel property which may, with 
methodological justification, be termed set-theoretic dominance. 

If, on the other hand, B elects to maximize his expected 
utility, again assigning equiprobable values of 1/4 to each of A's 
four strategies, he finds 


EU(dcdd) = 1 


which is indeed the maximum expected utility for all meta-strategies. 
(Again, ccdd is next-best, with EU(ccdd) = 3/4.) Both set-theoretic 
dominance and maximization of expected utility prescribe meta-stra- 
tegy dcdd. And set-theoretic dominance does so analytically, without 
recourse to the principle of insufficient reason or any other poten- 
tially objectionable a priori probabilistic rule. 
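
The notion of set-theoretic dominance lends itself to a direct computational check. The following Python sketch, offered under the same illustrative assumptions as before (the names best_reply_set, dominant and expected_utility_for_b are introduced here, not taken from Howard or Rapoport), forms for each of A's strategies the set of B's meta-strategies that yield B's better payoff, intersects the four sets, and then computes B's expected utilities under equiprobability.

```python
from fractions import Fraction
from itertools import product

# Game 6.1, keyed by (B's choice, A's choice); values are (payoff to B, payoff to A).
GAME_6_1 = {
    ("c", "C"): (1, 1), ("c", "D"): (-2, 2),
    ("d", "C"): (2, -2), ("d", "D"): (-1, -1),
}

# A's four conditional strategies, in the order A_1 .. A_4.
A_RULES = [
    lambda b: "C",
    lambda b: "C" if b == "c" else "D",
    lambda b: "D" if b == "c" else "C",
    lambda b: "D",
]

META = ["".join(letters) for letters in product("cd", repeat=4)]


def payoff_to_b(meta, k):
    """B's payoff when B plays `meta` and A plays strategy index k."""
    b = meta[k]
    return GAME_6_1[(b, A_RULES[k](b))][0]


def best_reply_set(k):
    """Meta-strategies yielding B's better payoff in column k (the set S_k)."""
    best = max(payoff_to_b(m, k) for m in META)
    return {m for m in META if payoff_to_b(m, k) == best}


S = [best_reply_set(k) for k in range(4)]
dominant = set.intersection(*S)


def expected_utility_for_b(meta):
    """B's expected utility for `meta` when A's strategies are equiprobable."""
    return Fraction(sum(payoff_to_b(meta, k) for k in range(4)), 4)


print(dominant, expected_utility_for_b("dcdd"), expected_utility_for_b("ccdd"))
```

The intersection is computed without reference to any probability distribution, which is the sense in which set-theoretic dominance operates analytically.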

To recapitulate: prisoner A's examination of the matrix of 
meta-game 6.4 leads him to the realization that prisoner B possesses 
a set-theoretically dominant meta-strategy, dcdd. Now A reasons that, 
if B is individually rational, B will opt for set-theoretic 
dominance. In that case, if A is individually rational, A should 
choose strategy A_2, not A_4. Although strategy A_4 has a higher 
expected utility than A_2, those expected utilities were found by 
assigning a uniform probability distribution to B's choice of 
meta-strategies. But, given that B is individually rational, the 
existence of B's set-theoretically dominant meta-strategy nullifies 
A's application of the principle of insufficient reason, and its 
consequent assignment of a priori equiprobabilities. 


7 Reflection shows that if a strategy is strongly dominant, or 
weakly dominant, then it is also set-theoretically dominant. In other 
words, both strong and weak dominance imply set-theoretic dominance. 
But set-theoretic dominance implies neither strong nor weak 
dominance, and therefore comprises a weaker but logically distinct 
category of dominance. 

If B is individually rational, B chooses the set-theoretically 
dominant meta-strategy dcdd. Now A examines the sub-matrix of the 
reduced game that obtains: 


Game 6.5 — Sub-Matrix of Game 6.4 


               A_1       A_2       A_3       A_4 
B    dcdd      2,-2      1,1       2,-2      -1,-1 


Now A has a strongly dominant strategy; namely, A_2. Thus if A 
is individually rational, A chooses strategy A_2. Now both prisoners 
realize that the individually rational joint choice is (A_2, dcdd), 
whose joint outcome is the mutually beneficial (1,1). Thus, 
individual rationality leads both prisoners to co-operate, and both 
prisoners gain one utile. 
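
A's reasoning in the reduced game can likewise be sketched in a few lines. In the Python fragment below, the names reduced_row, game_6_5 and best_for_a are hypothetical; the fragment fixes B's meta-strategy at dcdd, tabulates the resulting outcomes for each of A's strategies, and selects the strategy that maximizes A's payoff.

```python
# Game 6.1, keyed by (B's choice, A's choice); values are (payoff to B, payoff to A).
GAME_6_1 = {
    ("c", "C"): (1, 1), ("c", "D"): (-2, 2),
    ("d", "C"): (2, -2), ("d", "D"): (-1, -1),
}

# A's conditional strategies, listed in the order A_1 .. A_4.
A_STRATEGIES = {
    "A_1": lambda b: "C",
    "A_2": lambda b: "C" if b == "c" else "D",
    "A_3": lambda b: "D" if b == "c" else "C",
    "A_4": lambda b: "D",
}


def reduced_row(meta):
    """Outcomes (payoff to B, payoff to A) for each A strategy against `meta`."""
    row = {}
    for k, (name, rule) in enumerate(A_STRATEGIES.items()):
        b = meta[k]  # B's reply to A's k-th strategy
        row[name] = GAME_6_1[(b, rule(b))]
    return row


game_6_5 = reduced_row("dcdd")
# A's best reply is the column with the largest second component (A's payoff).
best_for_a = max(game_6_5, key=lambda name: game_6_5[name][1])
print(game_6_5)
print(best_for_a, game_6_5[best_for_a])
```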

This is in marked contrast to the previous games in this 
chapter. In game 6.1 (the Prisoner's Dilemma), strong dominance leads 
to the mutually-detrimental equilibrium outcome (-1,-1). In game 6.2 
(first-level meta-game of 6.1), weak dominance leads to the same 
result: (-1,-1). But in game 6.4 (second-level meta-game of 6.1), 
set-theoretic dominance leads to the mutually-beneficial, Pareto- 
optimal outcome (1,1). In the second-level meta-game, the dominance 
principle has finally "reversed" its prescription. 

And at the same time, a significant change occurs with respect 
to rational choice. In game 6.4, individual rationality leads both 
prisoners to mutual co-operation. Now suppose B is collectively 
rational. What should he do? Clearly, B should still choose 
meta-strategy dcdd, for two reasons. First, it contains a 
Pareto-optimal outcome, thus enabling both prisoners to benefit by 
mutual co-operation if A is also collectively rational. Second, if A 
is not collectively rational, then meta-strategy dcdd remains 
set-theoretically dominant, since it is the only choice that yields 
the maximum possible payoff to B no matter what A chooses. So, 
whether B is an individual or a collective rationalist, he chooses 
meta-strategy dcdd. 

Now suppose A is collectively rational. Then, by definition, he 
chooses strategy A_2 which, by definition, consists in choosing what A 
expects B to choose. So, whether A is an individual or a collective 
rationalist, he chooses strategy A_2. 

This turn of events, in which either type of rationality (or 
indeed a mixture of types) leads to the same meta-strategic outcome, 
(A_2, dcdd), evinces a singular observation by Rapoport: that 
individual and collective rationality are reconciled in this 
second-level meta-game of the Prisoner's Dilemma. 

However, the undeniable elegance and ingenuity of Howard's 
meta-game-theoretic resolution do not dispel the problem that haunts 
Prisoner's Dilemmas; rather, they transform it into a meta-problem. 
The reconciliation of individual and collective rationalities does 
not, in and of itself, guarantee that the dilemma is resolved. 

This point can be illustrated by a thought-experiment. Suppose 
three hundred pairs of people are to be placed in two-person Priso- 
ner's Dilemmas, with payoffs as in game 6.1. With respect to the 
participants, the pairs are formed by blind random selection, so that 
no prisoner knows the identity of the other prisoner in his par- 
ticular game. The three hundred pairs are then formed into three 
groups of one hundred pairs each, again by random selection. It is 
assumed that none of the six hundred participants has any prior 
knowledge of game theory, or of the Prisoner's Dilemma. 

The three groups are then isolated from one another, and each 
group is given a preparatory lecture on the Prisoner's Dilemma. These 
lectures, however, are not identical. 

Group One is informed about conflicting principles of choice 
and diverging concepts of rationality. The nature of the dilemma is 
made clear, but no meta-game resolution is presented. 


8 Rapoport, 1967, pp.55-6. 

Group Two is given the same lecture as Group One, and is 
additionally informed about the meta-game resolution of the dilemma. 

Group Three is given the same lecture as Group One (with no 
information about the meta-game resolution). 

Each pair of each group then plays one game of Prisoner's 
Dilemma. The players make their choices by a ballot-box method, and 
thus retain mutual anonymity. (The ballots are coded in such a way 
that only the experimenter, and not the players, can associate them 
pair-wise.) 

Next, the groups are asked to play one more game each. Groups 
One and Two proceed as above, but Group Three is first given an 
additional preparatory lecture, in which the meta-game resolution is 
presented. Then Group Three proceeds as above. 

The empirical question is, of course, whether an awareness of 
the meta-game resolution makes mutual co-operation more frequent. If 
it does, then Group Two should show a greater frequency of mutual co- 
operation than Group One. And if it does, then Group Three's frequen- 
cy of mutual co-operation in its first hundred trials should corre- 
late with that of Group One, and in its second hundred trials with 
that of Group Two. 

It might be interesting to conduct this experiment, or some 
variation of it. (If such an experiment has already been conducted, 
may this enquiry's ignorance be excused.) 

The point to be made is simply this: there is no a priori 
guarantee that an awareness of the meta-game resolution promotes 
unconditional mutual co-operation. Such awareness can have the 
opposite effect, in exactly the same way as consideration of collec- 
tive rationality, and a reformulation of the game-matrix, can result 
in non-Pareto-optimal outcomes. If the meta-game resolution convinces 
prisoner A to co-operate, it may equally well convince prisoner B to 
exploit A by defecting (and vice-versa). Similarly, both may be 
tempted into defection, each hoping to exploit the other. 
Another difficulty with the meta-game resolution is, as 
Rapoport points out, that it requires translation into a social 
context. 
With respect to conflict resolution, one may infer that Rapoport 
wishes for a more significant translation than that which merely 
utilizes the resolution as the basis for an experiment in social 
psychology. 

In any case, the Prisoner's Dilemma persists in the static 
mode, resisting attempted resolutions by re-definition of rational- 
ity, reformulation of game matrix, and expansion into meta-game 
decision space. These attempted resolutions are of undeniable theore- 
tical value, as they acknowledge that a mutually beneficial outcome 
is most certainly attainable. And in practice, while these resolu- 
tions cannot compel the prisoners to co-operate, they can make a 
profound appeal for collective rationality to prevail. Indeed, the 
meta-game resolution demonstrates that, in the last analysis, any 
type of rationality (whether individual or collective) is preferable 
to irrationality. 

The enquiry now turns to an examination of the Prisoner's 
Dilemma in the iterated mode, in order to investigate strategic 
interactions that obtain therein. 


9 Ibid., p.56. 


PART THREE: 
THE ITERATED PRISONER'S DILEMMA 
Chapter Seven 
A Tournament of Strategic Families 


In 1966, Rapoport asserted that 


"'What is the best way to play chess?' is not a 
game-theoretical question. On the other hand, 'Is there a best 
way to play chess?' is a game-theoretical question." 


Rapoport's assertion is a reflection of the taxonomic orientation of 
game-theory. 

The theory answers Rapoport's second question in the 
affirmative. Chess is classified as a two-person, constant-sum game of 
perfect information that is strictly determined. As such, chess 
belongs to the same class of games as tic-tac-toe. Thus von Neumann 
and Morgenstern remark: 


"This shows that if the theory of Chess were really fully 
known there would be nothing left to play..." 


To appreciate why this is so, consider tic-tac-toe, which ceases to 
fascinate players as soon as they realize that it is always possible 
for either player to force a draw. Between two experienced players, a 
drawn outcome is a foregone conclusion. From a game-theoretic point 
of view, chess differs only in so far as the number of possible chess 
games is incomparably greater than the number of possible games of 
tic-tac-toe. So, even between two experienced chess players, a draw 
is not a foregone conclusion, since either or both players are likely 
to become entangled in a welter of possible combinations of moves, 
and thus fail to find the "best move" in a given situation. If all 
chess players continually found their best moves, all chess games 
would be drawn. 
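
For tic-tac-toe, the claim that best play yields a draw can be verified by exhaustive search, precisely because the game is small. The following Python sketch is a plain minimax evaluation with memoization; the function names winner and value, and the encoding of the board as a nine-character string, are assumptions of this illustration. No comparable search is feasible for chess, which is the point of the contrast drawn above.

```python
from functools import lru_cache

# The eight winning lines on a board whose cells are indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Value of the position for X (+1 win, 0 draw, -1 loss), `player` to move."""
    won = winner(board)
    if won is not None:
        return 1 if won == "X" else -1
    if " " not in board:
        return 0  # board full with no completed line: a draw
    other = "O" if player == "X" else "X"
    results = [value(board[:i] + player + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == " "]
    # X chooses the move with the highest value, O the move with the lowest.
    return max(results) if player == "X" else min(results)


empty_board = " " * 9
game_value = value(empty_board, "X")
print(game_value)
```

A value of zero for the empty board means that neither player can force a win against best play, so that each can secure at least a draw.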

But the theory of games does not provide an answer to 
Rapoport's first question ('What is the best way to play chess?'). The 
theory of games merely asserts that a "best" chess move always 
exists, on every turn of every game, whether or not a player actually 
finds it. The theory does not purport to tell the player how to find 
it. The player who wishes to answer this question must consult the 
copious literature on the theory of chess openings, middle-games, and 
end-games. The player who does so will soon discover that, although 
the literature abounds with sound recommendations on how not to play 
the game, the vast number of possible combinations of good moves 
precludes exhaustive deterministic analysis. 


1 Rapoport, 1966, p.14. 

2 Neumann & Morgenstern, 1955, p.125. 

In 1980, Axelrod enquired "What is the most effective way to 
play the iterated Prisoner's Dilemma?" At first blush, this question 
may seem not only non-game-theoretic, but also self-contradictory. It 
seems to assume that the game-theoretic question "Is there a most 
effective way to play the iterated Prisoner's Dilemma?" can be 
answered in the affirmative. But the Prisoner's Dilemma belongs to 
the class of non-zero-sum, non-co-operative games. So game-theory 
asserts that, in the static mode, there does not exist a "most 
effective way" to play the game. (And Part Two of this enquiry 
certainly corroborates the theory, if by "most effective way" one 
understands "the way that guarantees a Pareto-optimal outcome".) How 
then, one may ask, can Axelrod expect to find, in the iterated mode, 
something which the theory pronounces non-existent in the static 
mode? 

To answer this, one recalls the von Neumann-Morgenstern dis- 
claimer: 


"We make no concessions: Our viewpoint is static and we 
are analyzing only a single play." 


Since its formal articulation by von Neumann and Morgenstern, the 
theory of games has been extended into many new domains, including 
that of iterated games. Rapoport has made significant contributions 
to the theory of iterated Prisoner's Dilemmas, and has thus helped 
the theory to keep pace with the profusion of iterated experiments. 


3 Axelrod, 1980a & 1980b, p.3 & p.379 respectively. 

4 Neumann & Morgenstern, 1955, p.147. 

5 E.g. for treatments of Markov chain models, equilibrium 
models, stochastic learning models, and classical dynamic models, see 
Rapoport & Chammah, 1965, pp.115-50. For an inductive theory of 
iterated Prisoner's Dilemmas, see Rapoport, 1966, pp.145-57. 


Theories of the iterated Prisoner's Dilemma posit a tendency 
toward joint similar play (either mutual co-operation or mutual 
defection). Which outcome obtains depends upon initial conditions 
and the length of the game. Rapoport also finds agreement between 
theory and experiment: 


"The initial gross trend in repeated plays of Prisoner's 
Dilemma is toward more defection. After a while 'recovery' 
sets in, and the frequency of co-operative responses 
increases. This recovery is relatively quick and pronounced 
when the matrix is displayed but comes much later and 
is relatively weak when the matrix is not displayed. The 
steady decline of the unilateral states, i.e. the in- 
creasing predominance of CC and DD states, is evidently 
responsible for the fact that paired players become more 
and more like each other in repeated plays of Prisoner's 
Dilemma." 


Given these developments, it seems quite reasonable to assume 
that some ways of playing the iterated Prisoner's Dilemma are more 
effective than others (if by "more effective" one understands "more 
likely to lead to repeated mutual co-operation than to repeated 
mutual defection"). Armed with that assumption, Axelrod conducted two 
experiments aimed at discovering precisely which ways might prove 
more, and less, effective. The experiments were conducted as com- 
puter-run tournaments, in which competitors submitted strategies in 
the form of programs. 

Before summarizing the results of Axelrod's tournaments, and 
prior to analyzing the results of a third tournament conducted as 
part of this enquiry, one must state Axelrod's overriding conclusion: 
there is no "best" strategy independent of environment. This 
conclusion is consistent with the theory of the Prisoner's Dilemma. No 
strategy is unconditionally Pareto-optimal, in either the static or 
the iterated mode. But, in a given environment, certain strategies 
may prove more conducive to Pareto-optimality than others. Having 
defined a particular environment, Axelrod is able to identify such 
strategies. 


6 Ibid. 


7 Rapoport & Chammah, 1965, p.102. See also R. Clarke, The 
Science of War and Peace, Jonathan Cape, London, 1971, pp.281 ff. 


8 Axelrod, 1980a & 1980b, p.21 & p.402 respectively. 

An analogy with Darwinian evolutionary theory might be 
appropriate. Within a given species, genetic mutations give rise to 
phenotypic differences. Some phenotypes are more favoured by natural 
selection than are others. If one asked a Darwinian evolutionary 
theorist "What kind of phenotype is best-adapted?" for a given 
species, the theorist would almost certainly reply that there is no 
"best" adaptation independent of environment. 

If one takes into account local environmental factors (such as 
climate, terrain, indigenous flora and fauna, species-specific 
habits, and so forth), then conjectures about the adaptiveness or 
non-adaptiveness of a particular phenotype become meaningful. And 
selective domestic breeding, enhanced by knowledge of genetic theory, 
enables the directed dispersion or suppression of a given phenotype's 
frequency in successive generations of a particular population in a 
controlled environment. 

Similarly, by studying the interactions of different strategies 
under varying conditions, one may gradually identify properties which 
tend to make a given strategy more (or less) effective in a par- 
ticular strategic population in a defined environment. Flexible 
strategies can be modified, until their performance is optimally 
effective in the context of their competitors and surroundings. 

The key environmental factors in Axelrod's tournaments are: the 
payoffs to the players, the number of iterations in a game, the 
players' knowledge of this number, and the actual strategies in 
competition. Let the import of these be discussed in turn. 

First, consider the iterated game matrix (game 7.1, overleaf). 
The transitivity of the payoffs in the iterated mode is identical to 
that in the static mode. But game 7.1 differs from game 4.1 with 
respect to the added constraint, R > (1/2)(S+T). This constraint is 
necessary in iterated games. If (S+T) > 2R, the players would gain 
more by alternating choices (C,d) and (D,c) than by mutual 
co-operation (C,c). As Rapoport points out, the players would then 
have at their disposal a form of tacit collusion, which is not 
supposed to occur in a Prisoner's Dilemma. 


9 Rapoport & Chammah, 1965, pp.34-35.

Game 7.1 — The Iterated Prisoner's Dilemma


                B
            c       d
    C      R,R     S,T
A
    D      T,S     P,P


where T > R > P > S
and R > (1/2)(S+T)


The actual payoffs used by Axelrod are: 


Game 7.2 — Axelrod's Tournament Matrix 


                B
            c       d
    C      3,3     0,5
A
    D      5,0     1,1


All competitors were aware of this payoff structure before 
submitting their strategies. In game 7.2, one can see explicitly how 
the constraint applies. If the players alternate choices of (C,d) and 
(D,c), they each gain an average of 2.5 utiles per move. If they both 
co-operate, they each gain 3 utiles per move. Thus the constraint R >
(1/2)(S+T) discourages tacit collusion, by making it less profitable
than mutual co-operation. 
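
The arithmetic behind this constraint can be checked directly. The following Python sketch is illustrative only: the function names are assumptions introduced here, and form no part of Axelrod's tournament software.

```python
from fractions import Fraction

# Axelrod's tournament payoffs (game 7.2).
T, R, P, S = 5, 3, 1, 0

def is_iterated_pd(t, r, p, s):
    """True if the payoffs satisfy T > R > P > S and R > (1/2)(S+T)."""
    return t > r > p > s and 2 * r > s + t

def alternation_average(t, s):
    """Average payoff per move when the players alternate (C,d) and (D,c)."""
    return Fraction(s + t, 2)

print(is_iterated_pd(T, R, P, S))        # True
print(float(alternation_average(T, S)))  # 2.5, which is less than R = 3
```

With T raised to 7 and the other payoffs unchanged, the check fails, since alternation would then pay 3.5 utiles per move against 3 for mutual co-operation.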

Second, the number of iterations in a game is important because 
some strategies employ a posteriori probability considerations; in 
other words, they look at relative frequencies of past outcomes. If 
the game is not long enough, these frequencies may fail to attain 
sufficient closeness to their limiting values (where such values
exist), and the accuracy of the probability calculus in question may
be impaired.10

Third, the players' knowledge of the length of a game (rather 
than the length of a game itself) can give rise to the phenomenon of
"reverse induction".11 If a game is known to consist of j moves, then
player A may be tempted to defect on the jth move, since there will
be no (j+1)th opportunity for player B to retaliate. If player B
reasons similarly, mutual defection occurs as the jth outcome. In
that case, player A reasons that he may as well defect on his (j-1)th
move too, since player B will defect on the next move in any
case. If player B reasons similarly, mutual defection will occur on
the (j-1)th move. Then, by reverse or backward induction, the players
will defect on all moves. 

Axelrod found partial empirical confirmation of this phenome- 
non. In his first tournament, the length of each game was held 
constant at two hundred moves, and all players were informed of this 
in advance. Some players submitted strategies which, regardless of 
their respective decision rules operative throughout most of the 
game, defected unconditionally during the last several moves, in the 
hope of exploiting both intrinsically co-operative strategies and 
strategies too slow to retaliate. In his second tournament, Axelrod 
amended the ground rules such that


". . . the length of the games was determined probabilistically
with a 0.00346 chance of ending with each given
move. This parameter was chosen so that the expected
median length of a game would be 200 moves. . . Since no
one knew exactly when the last move would come, end-game
effects were successfully avoided in the second round."12
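
The relation between the quoted parameter and the median game length can be verified with a short calculation. The sketch below assumes that each move is the last with independent probability w, so that the probability of a game having ended by move n is 1 - (1-w)^n; the function names are hypothetical.

```python
import math

def median_length(w):
    """Smallest n for which P(game has ended by move n) >= 1/2,
    when each move is the last with independent probability w."""
    return math.ceil(math.log(0.5) / math.log(1.0 - w))

def mean_length(w):
    """Expected number of moves under the same assumption."""
    return 1.0 / w

print(median_length(0.00346))       # 200
print(round(mean_length(0.00346)))  # 289
```

Under this assumption the median is 200 moves, as Axelrod states, while the mean is somewhat longer.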


10 For example, in a different context: if one attempts to 
assess the "fairness" of a coin by tossing it only ten times, it 
might yield two heads and eight tails. This would not justify the 
conclusion that the coin is unfair. A fair coin will approach a 
limiting frequency of n/2 heads and n/2 tails per n tosses, as the 
number of trials increases. To obtain a result within a desired 
degree of closeness to this distribution, one must make a correspond- 
ingly large number of trials. E.g. see von Mises, 1981, passim. 


11 Rapoport & Chammah, 1965, pp.28-29.


12 Axelrod, 1980b, p.383.

Fourth, the actual strategies in competition are a major
determining factor of the effectiveness (or ineffectiveness) of a
given strategy. For instance, both of Axelrod's tournaments were won
by a strategy submitted by Rapoport, called "tit-for-tat" (acronym
TFT). TFT is a simple and elegant decision rule: it co-operates on
its first move, and plays next whatever its opponent played previously.
TFT is the game-theoretic equivalent of lex talionis. But,
notwithstanding TFT's impressive performance (it defeated fourteen 
other entrants in the first tournament, and sixty-two others in the 
second), Axelrod is able to furnish several reasons why he considers 
TFT not to be the "best" strategy in iterated Prisoner's Dilemmas. 

In the first tournament, relative success among the eventual 
top eight strategies turned out to be heavily influenced by the 
presence of two "kingmaker" strategies, so-called because they did 
not finish well themselves, but largely determined the order of 
finish among the top eight.13 TFT fared better against these
kingmakers than did any other strategy.

A strategic environment is shaped not only by the presence of 
certain strategies, but also by the absence of others. Again with
respect to the first tournament, Axelrod cites three strategies, any 
of which would have won had it been submitted. 

The first, somewhat ironically, was a "sample program" sent to
prospective contestants, in order to illustrate the desired format 
for a submission. The strategy itself is "tit-for-two-tats" (acronym 
TIT). TIT co-operates on the first two moves, and defects only after 
its opponent defects on two consecutive moves. The sample program of 
TIT, says Axelrod, 


". . . would in fact have won the tournament if anyone
had simply clipped it and mailed it in! But no-one
did."14


The second unsubmitted winning strategy was also available to 
most contestants, since it was included in a report of a preliminary 
tournament circulated for subsequent recruitment. The strategy used a
look-ahead, tree-searching technique that is popular in artificial
intelligence programs. It will not be considered in this enquiry's
game-theoretic context.


13 Idem., 1980a, pp.10-13.


14 Ibid., p.20.

A third strategy, which ranked only tenth among fifteen en- 
trants, would have won the tournament with a slight modification. 
Called Downing (after its submittor), this strategy is none other 
than the maximization of expected utility. Downing, faced with the 
task of assessing the probability of each opposing strategy's co- 
operation or defection before the commencement of play, without 
knowing explicitly against which strategies he would compete, had 
recourse to the a priori principle of insufficient reason. He ini- 
tially assigned equiprobable values of (1/2, 1/2) to the likelihood 
of an opposing strategy's co-operation or defection. His program then 
updated this arbitrary probability distribution after each move, 
according to the actual move made by the given opposing strategy. 

Had Downing selected a more optimistic initial weighting (i.e.
one that assumed a greater a priori likelihood of co-operation and a
correspondingly lesser a priori likelihood of defection on the part
of opposing strategies), then, Axelrod asserts, Downing would have
won by a large margin.15

Axelrod does not say which of these three hypothetical winning 
strategies would have prevailed had they all been submitted, but his 
point about the relativity of TFT's success is well taken. 

And in the second tournament, although the number of entrants 
more than quadrupled, TFT once again proved its relative superiority, 
prevailing against all competitors. Nevertheless, Axelrod steadfastly 
maintains that TFT is not the "best" decision rule in the iterated 
Prisoner's Dilemma, and gives three reasons why not. 

Firstly, Axelrod describes a hypothetical strategy that would 
have won the second tournament, had it been submitted. Such a stra- 
tegy would have the necessary property (which no other submission 
manifested) of being able to identify and defect against a random 
strategy, while not mistaking any non-random strategy for a random 
one. This property is difficult to implement; Axelrod confesses that
he attempted to write such a program himself, but the program did not
perform ideally.17


15 Ibid.


16 Idem., 1980b, pp.401-402.

Secondly, Axelrod observes that, had a third tournament been
conducted, with a field of entries composed solely of the upper half
of the second tournament standings, then TFT would have ranked only 
fourth. The first, second, and third places in this hypothetical 
third tournament would have been occupied by strategies which placed 
twenty-fifth, sixteenth, and eighth, respectively, in the second 
tournament. 

This observation is an excellent illustration of the dependence 
of a given strategy's success on the constitution of the overall 
strategic population. In the second tournament, the twenty-fifth- 
ranked strategy evidently fared magnificently against the upper half, 
but this success was altogether marred by its exceedingly poor 
performance against the lower half. The standings of the second 
tournament completely disguise the fact that the twenty-fifth-ranked 
strategy is actually the best among the upper half of the whole, if 
one discards the lower half. This example shows the necessity for 
caution when drawing inferences about the strength or weakness of a 
given strategy. 

Thirdly, Axelrod re-states his overriding conclusion: TFT 
cannot be the best strategy in the iterated Prisoner's Dilemma, 
because there is no "best" strategy independent of environment.18

Nonetheless, it is germane to speak of the "successfulness" of 
a particular strategy, as a relative indicator (rather than an 
absolute measure) of its performance in a given environment. Axelrod 
describes a "successful" strategy, in the context of his tournaments, 
in terms of three attributes: "niceness", "provocability", and
"forgiveness".19 A strategy is said to be nice if it never defects
first; provocable, if it is able to defect in response to defection; 
forgiving if, after having been provoked, it is able to co-operate in 
response to renewed co-operation. 


17 Ibid., p.402.


18 Ibid.


19 Ibid., pp.389-95.

In the first tournament, the eight top-ranking strategies were
nice; the bottom six, not nice.20 In the second tournament, fourteen
of the fifteen top-ranking strategies were nice; fourteen of the
bottom fifteen, not nice.21 Thus Axelrod found some correlation
between niceness and success. 

The other two attributes’ effects on a strategy's success are 
more difficult to gauge. TFT, for example, is rapidly provocable yet 
quick to forgive. By playing next whatever its opponent played 
previously, it swiftly punishes defection, but harbours no grievance 
in the process. It judges each opponent's move on its individual 
merit, and disregards the history of the game. TIT is just as forgiv- 
ing, but less provocable, since it defects only after two consecutive 
defections by an opponent. Recall that TIT would have won the first
tournament, but was ironically not submitted. It was submitted in the 
second tournament, but ironically did not win. 

Thus TFT won the first tournament because of the absence of 
TIT, and won the second tournament despite its presence. Again, this 
result reflects the dependence of a strategy's success upon the other 
types of strategies in the competing population. The extent to which 
a given degree of provocability or forgiveness conduces to success, 
is therefore also environment-dependent. 

A strategy that manages to perform with relative success in a
variety of environments is said to be "robust".22 In Axelrod's two
tournaments, TFT demonstrated greater robustness than any other 
strategy submitted. In sum, according to Axelrod's findings, a robust 
strategy should possess the property of niceness, and be imbued with 
a propitious combination of the qualities of provocability and 
forgiveness. 

Axelrod, however, adds the following caveat:


"Being able to exploit the exploitable without paying too
high a cost with the others is a task which was not
successfully accomplished by any of the entries in round
two of the tournament."23


20 Idem., 1980a, p.9.


21 Idem., 1980b, pp.389-90.


22 Ibid., pp.396-8.

This remark gives one pause to wonder whether a strategy's robustness
might improve if it were imbued with the additional quality of 
"exploitiveness". An exploitive strategy, then, would be able to 
exploit the exploitable without becoming vulnerable to exploitation 
itself. 

This completes a sketch of Axelrod's environment for the 
iterated Prisoner's Dilemma, in terms of four key factors (the 
payoffs to the players, the number of iterations in a game, the 
players' knowledge of this number, and the actual strategies in 
competition). Axelrod organizes and presents both tournaments' 
results in a highly interesting fashion, and this enquiry will 
continue to refer to, and aspire to develop, particular aspects of
his findings. 

It can be observed that Axelrod's tournaments are quintessen- 
tially game-theoretic in spirit, since they provide environments for 
strategic competition while eschewing involvement in the psychology 
of the strategists themselves. In an iterated Prisoner's Dilemma, a 
player adopts a particular strategy, which generates choices according
to its decision rule. A game-theorist is interested in the relative
robustness of the given strategy; he is not concerned with the
strategist's psychological motives for adopting it. 

And although it is assumed that each strategist prefers his 
strategy to fare as well (and not as poorly) as possible in competition,24
an assessment of a particular strategy's success can be made
quite independently of its strategist's rationality, or irrationa- 
lity. 

Suppose ten players compete in an iterated Prisoner's Dilemma.
Let player A's strategy be: defect on Mondays, and co-operate on
other days of the week. Let the remaining nine players' strategies
be: co-operate on Mondays, and defect on other days of the week. The
game-theorist who conducts this competition observes that player A's
strategy is the most successful on Mondays, and the least successful
on other days of the week. This observation is independent of the
rationality, or irrationality, of the players.


23 Ibid., p.403.


24 A known exception is the thirteenth-ranked strategy in Axelrod's
first tournament, submitted "out of scientific interest rather
than an expectation that it would be a likely winner". See Axelrod,
1980a, pp.23-4.

Another significant feature of Axelrod's experiments now bears 
mention. Axelrod regulated three of four key environmental factors 
(the payoffs to the players, the number of iterations in a game, and
the players' knowledge of this number). He made no initial attempt to
regulate the actual strategic types in competition. His strategic
population can thus be labelled the "wild" variety. Note that this
"wildness" led to a spate of late-game defections in the first
tournament. In order to counteract that phenomenon, Axelrod replaced 
the fixed number of moves per game (in the first tournament) with a 
probabilistic number of moves per game (in the second tournament), 
and thus exerted indirect influence on the late-game behaviour of the 
second field of entries. The strategic population itself, however, 
remained of the "wild" variety, since Axelrod regulated neither the 
inclusion nor exclusion of particular strategic types in or from 
competition. 

Now suppose a complementary experiment were conducted. The 
alternative to a "wild" population is, of course, a "domesticated" 
one. If the strategic types in competition are themselves regulated, 
the experimenter then exercises fuller control over the tournament
environment. "Domesticated" strategies can be "bred" which incor- 
porate, or lack, virtually any combination of niceness, provocabi- 
lity, forgiveness, and exploitiveness, among other qualities. And 
certain "wild" strategies, selected for their robustness, can be 
maintained "in captivity", and induced to compete against the
"domesticated" strategies.

It was hypothesized that such an experiment, featuring competi- 
tion among "wild" robust strategies and "domesticated" strategies of 
unknown robustness, would provide a complementary basis for com- 
parison with Axelrod's results and conclusions. In order to test the 
hypothesis, an "interactive" tournament was conducted; so named 
because it involves interaction between "domesticated" and "wild"
strategies, and also because all possible sub-tournaments (i.e. all
possible combinations of strategies) of the main tournament are
considered. This enquiry now proceeds to describe several facets of
the interactive experiment, and to discuss some of its results.

As in the case of Axelrod's tournaments, the interactive
tournament can be described in terms of four environmental factors. 
First, its payoff structure is identical to that of game 7.2. Second, 
the number of moves in each game is fixed and constant at one thou- 
sand. This number of moves allows slowly-developing strategies to 
attain their optimal performance levels (and does not affect the 
performance of quickly-developing strategies). Third, all strategies 
have the property of "integrity"; that is, each strategy adheres to 
its normal decision rule for the full one thousand moves per game. No 
strategy deviates from its normal decision rule by making late-game 
defections. Fourth, the twenty competing strategies are grouped into 
"families". The members of each strategic family share distinguishing 
characteristics. 

The five families in the interactive tournament, and their 
respective members' acronyms and decision rules, are as follows: 


(I) The Probabilistic Family 

Members of this family co-operate and defect randomly, accord- 
ing to their individual probabilistic weightings. The two pure 
strategies (pure co-operation and pure defection) are included in 
this family because their program structure is identical to that of 
the other members. The members’ decision rules thus differ by a sole 
parameter; namely, the probability of co-operation on a given move. 
This is the only family in the tournament whose members make their 
moves without taking their opponent's moves into account. 

(a) DDD: This is the strategy of pure defection. On every move, 
DDD co-operates with a probability of zero, and defects with a
probability of unity. 

(b) TQD: This is the strategy of three-quarter random defection.
On every move, TQD co-operates with a probability of 1/4, and
defects with a probability of 3/4.

(c) RAN: This is the strategy of random equiprobability. On
every move, RAN co-operates or defects with a probability of 1/2.

(d) TQC: This is the strategy of three-quarter random co-operation.
On every move, TQC co-operates with a probability of 3/4,
and defects with a probability of 1/4.

(e) CCC: This is the strategy of pure co-operation. On every 
move, CCC co-operates with a probability of unity, and defects with a 
probability of zero. 
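
Since the five members differ only in one parameter, the whole family can be sketched as a single rule. The Python below is a minimal illustration, not the tournament program itself; the acronyms TQD and TQC are taken to denote the three-quarter defection and three-quarter co-operation strategies, and the table and function names are assumptions introduced here.

```python
import random

# Probability of co-operation on any given move, for each family member.
COOP_PROBABILITY = {"DDD": 0.0, "TQD": 0.25, "RAN": 0.5, "TQC": 0.75, "CCC": 1.0}

def probabilistic_move(name, rng):
    """Return 'C' or 'D' for the named member; the opponent's moves are ignored."""
    return "C" if rng.random() < COOP_PROBABILITY[name] else "D"

rng = random.Random(0)  # fixed seed, for repeatability only
moves = [probabilistic_move("TQC", rng) for _ in range(1000)]
print(moves.count("C") / 1000)  # close to 0.75
```

Because `random()` returns a value in the half-open interval [0, 1), the two pure strategies never deviate: DDD always defects and CCC always co-operates.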


(II) The Tit-for-Tat Family 

Members of this family are all related to tit-for-tat, and 
hence share a similar program structure. Small variations in members’ 
decision rules can naturally result in large variations in competi- 
tive performance. 

(a) TFT: Tit-For-Tat is the "founding member" of the family, 
and was the most robust strategy in Axelrod's tournaments. TFT co- 
operates on the first move, and plays next whatever its opponent 
played previously. 

(b) TIT: Tit-for-Two-Tats is less provocable than TFT. TIT 
would have won Axelrod's first tournament (had it competed), but 
fared less well in the second. TIT co-operates on the first two 
moves, and defects only after two consecutive defections by its 
opponent. 

(c) BBE: This strategy attempts to "burn both ends" of the
strategic candle. It plays exactly as TFT, with one modification: BBE 
responds to an opponent's co-operative move by co-operating with a 
probability of 9/10. BBE thus attempts to out-perform TFT by being 
equally provocable but less reliably forgiving. 

(d) SHU: This is Shubik's strategy, which ranked fifth in 
Axelrod's first tournament. It plays as TFT, with the following 
modification. SHU defects once following an opponent's first defec- 
tion, then co-operates. If the opponent defects on a second occasion 
when SHU co-operates, SHU then defects twice before resuming co- 
operation. After each occasion on which the opponent defects when SHU 
co-operates, SHU increments its retaliatory defections by one. SHU 
thus becomes progressively less forgiving, in direct arithmetic
relation to the number of occasions on which SHU's co-operation meets
with an opponent's defection.

(e) TAT: Tat-for-Tit is the binary complement of TFT. TAT 
defects on its first move, then plays next the opposite of whatever
its opponent played previously. TAT thus defects in response to co- 
operation, and co-operates in response to defection. TAT has been 
bred to exhibit contrariness, and can be thought of as the "bête 
noire" of the TFT family. 
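
Four of the decision rules described above can be sketched in a few lines each. The Python below is an illustration under stated assumptions: each rule is written as a function of the two move histories and a random number generator, the `play` helper is hypothetical, and SHU is omitted for brevity because it must carry a retaliation counter from move to move.

```python
import random

def tft(own, opp, rng):
    """Tit-for-tat: co-operate first, then echo the opponent's last move."""
    return "C" if not opp else opp[-1]

def tit(own, opp, rng):
    """Tit-for-two-tats: defect only after two consecutive defections."""
    return "D" if opp[-2:] == ["D", "D"] else "C"

def bbe(own, opp, rng):
    """As TFT, but answers co-operation with co-operation only 9 times in 10."""
    if not opp:
        return "C"
    if opp[-1] == "D":
        return "D"
    return "C" if rng.random() < 0.9 else "D"

def tat(own, opp, rng):
    """Tat-for-tit: defect first, then play the opposite of the opponent's last move."""
    if not opp:
        return "D"
    return "D" if opp[-1] == "C" else "C"

def play(a, b, moves, seed=0):
    """Play rule a against rule b; return both move histories."""
    rng = random.Random(seed)
    ha, hb = [], []
    for _ in range(moves):
        ma, mb = a(ha, hb, rng), b(hb, ha, rng)
        ha.append(ma)
        hb.append(mb)
    return ha, hb

ha, hb = play(tft, tat, 8)
print("".join(ha))  # CDDCCDDC
print("".join(hb))  # DDCCDDCC
```

In this pairing TFT and TAT settle into a cycle of four joint outcomes, which illustrates how a small change in a decision rule can alter the course of a game.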


(III) The Maximization Family 

All members of this family maximize expected utilities, but do 
so with different initial probabilistic weightings. Each member plays 
randomly for one hundred moves (co-operating or defecting according 
to its particular weighting), and keeps track of all moves made by 
both itself and its opponent. After one hundred moves, an "event 
matrix" of joint outcome frequencies is used to assign a posteriori 
probabilities in the calculation of expected utilities for the one- 
hundred-and-first move, and all moves thereafter. The generalized 
event matrix takes this form: 


Game 7.3 — Event Matrix for Maximization Strategy versus Opponent 


                        Opponent
                      c         d
                C     W         X
Maximization
Strategy
                D     Y         Z


where W = number of occasions on which outcome (C,c) obtained
      X = number of occasions on which outcome (C,d) obtained
      Y = number of occasions on which outcome (D,c) obtained
      Z = number of occasions on which outcome (D,d) obtained

Now, recalling the generalized expressions introduced in 
Chapter Four, the maximization strategy finds its expected utilities 
as follows:

EU(C) = p(c/C)(R) + (1-p)(d/C)(S)
EU(D) = (1-p')(c/D)(T) + p'(d/D)(P)


Note that the maximization strategy avoids the difficulties latent in
a priori probability formulations, by the expedient of evaluating a
posteriori probabilities, or outcome frequencies, directly from the
event matrix. Hence


p(c/C) = W/(W+X)
(1-p)(d/C) = X/(W+X)
(1-p')(c/D) = Y/(Y+Z)
p'(d/D) = Z/(Y+Z)


while the tournament payoffs are T = 5, R = 3, P = 1, S = 0. Then


EU(C) = 3W/(W+X)
EU(D) = (5Y+Z)/(Y+Z)


If EU(C) is greater than or equal to EU(D), the maximization
strategy co-operates on move one-hundred-and-one; otherwise, it 
defects. The maximization strategy continues to record outcomes 
throughout the game, and thus updates the event matrix after every 
outcome. As the frequency distribution of outcomes changes, the 
maximization strategy's propensity toward co-operation or defection 
also changes accordingly. 
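
The decision rule just described can be sketched as follows. This is a minimal illustration rather than the tournament program: the function names are assumptions, the counts in the usage example are hypothetical, and the sketch presumes that the strategy has both co-operated and defected at least once, so that neither denominator is zero.

```python
from collections import Counter

T, R, P, S = 5, 3, 1, 0  # tournament payoffs

def tally(history):
    """Count joint outcomes (own move, opponent's move) into W, X, Y, Z."""
    counts = Counter(history)
    return counts["C", "C"], counts["C", "D"], counts["D", "C"], counts["D", "D"]

def expected_utilities(w, x, y, z):
    """Return (EU(C), EU(D)) from the event-matrix counts."""
    eu_c = (R * w + S * x) / (w + x)
    eu_d = (T * y + P * z) / (y + z)
    return eu_c, eu_d

def choose(w, x, y, z):
    """Co-operate if EU(C) >= EU(D); otherwise defect."""
    eu_c, eu_d = expected_utilities(w, x, y, z)
    return "C" if eu_c >= eu_d else "D"

# Hypothetical record of one hundred joint outcomes.
history = ([("C", "C")] * 45 + [("C", "D")] * 5 +
           [("D", "C")] * 5 + [("D", "D")] * 45)
print(expected_utilities(*tally(history)))  # (2.7, 1.4)
print(choose(*tally(history)))              # C
```

Appending each new joint outcome to the history and re-tallying reproduces the move-by-move updating of the event matrix.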

The program structure is identical for every member of this 
family. The critical parameter, in whose value the members differ, is 
the weight accorded to the probability of a member's random co- 
operation during the first one hundred moves. It was hypothesized 
that the properties of the event matrix would be substantially 
affected by a combination of two factors: this initial choice of 
weight, and the type of opposing strategy encountered. Consequently, 
the maximization family was "bred" to represent a range of weights. 

(a) MEU: Maximization of Expected Utility is the familial 
prototype, which appeared in Axelrod's tournaments under the name of
its submittor, Downing. Downing ranked tenth among fifteen entries in
the first tournament; fortieth among sixty-three in the second.
Downing adopted the principle of insufficient reason, and assigned a
priori probabilities of

p(c/C) = p(d/C) = 1/2, and p'(c/D) = p'(d/D) = 1/2

prior to its first move. It then updated the probabilities according
to the relative frequencies of actual outcomes. Recall that Downing
would have finished first (in the first tournament) had its initial
probabilistic outlook been more optimistic.

MEU randomly co-operates or defects with probability 1/2 for 
the first hundred moves. But in contrast to Downing, MEU assumes 
nothing about the play of its opponent. Instead, MEU notes its 
opponent's choice on each move, and records each joint outcome in the 
event matrix. 

For example, suppose MEU encounters TQC (which randomly co-operates
on every move with probability 3/4, and defects with probability
1/4). Then, after one hundred moves, the most probable event
matrix is as follows: 


Game 7.4 — MEU versus TQC, Event Matrix After 100 Moves


                TQC
            c       d
    C      38      12
MEU
    D      38      12


MEU's expected utilities are:


EU(C) = (38/50)x3 = 2.28 
EU(D) = (38x5 + 12)/50 = 4.04 


Thus MEU defects on its one-hundred-and-first move.

(b) MAD: This strategy maximizes expected utilities, with
initial weighting at defection. MAD plays exactly as MEU, except that
on each of its first one hundred moves, MAD defects with a probability
of 9/10, and co-operates with a probability of 1/10.

Once again, for example, suppose MAD encounters TQC. Now the
most probable event matrix, after one hundred moves, is:


Game 7.5 — MAD versus TQC, Event Matrix After 100 Moves


                TQC
            c       d
    C       7       3
MAD
    D      67      23


MAD's expected utilities are:


EU(C) = (7/10)x3 = 2.1
EU(D) = (67x5 + 23)/90 = 3.98


Thus MAD defects on its one-hundred-and-first move.

(c) MAE: This strategy maximizes expected utilities, with 
initial weighting at equal expectation. The actual values of this 
weighting are dependent upon the particular payoffs of the game. Most 
generally, if expectations are to be equal, then 


p(c/C)R + (1-p)(d/C)S = (1-p')(c/D)T + p'(d/D)P


Since MAE makes no a priori assumptions about conditional probabili- 
ties (i.e. makes no assumptions about an opponent's moves), it re- 
expresses this equality in terms of the probability distribution of 
its own moves: 


xR + (1-x)S = (1-x)T + xP

where x is the probability that MAE co-operates on each of its first
one hundred moves. Applying the payoffs of the interactive tournament,

3x = 5(1-x) + x
x = 5/7

So MAE plays exactly as MEU, except that on each of its first
one hundred moves, MAE co-operates with a probability of 5/7, and
defects with a probability of 2/7.
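
Collecting the terms in x gives x(R - S + T - P) = T - S, so the weighting can be computed for any admissible payoffs. The following Python sketch uses exact fractions; the function name is an assumption introduced for illustration.

```python
from fractions import Fraction

def equal_expectation_weight(t, r, p, s):
    """Solve xR + (1-x)S = (1-x)T + xP for x."""
    return Fraction(t - s, (r - s) + (t - p))

x = equal_expectation_weight(5, 3, 1, 0)
print(x)      # 5/7
print(1 - x)  # 2/7
```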

As in the preceding examples, if MAE encounters TQC, the most
probable event matrix after one hundred moves is:


Game 7.6 — MAE versus TQC, Event Matrix After 100 Moves


                TQC
            c       d
    C      54      18
MAE
    D      21       7


MAE's expected utilities are:


EU(C) = (54/72)x3 = 2.25
EU(D) = (21x5 + 7)/28 = 4 


Thus MAE defects on its one-hundred-and-first move. 

(d) MAC: This strategy maximizes expected utilities, with 
initial weighting at co-operation. MAC plays exactly as MEU, except 
that on each of its first one hundred moves, MAC co-operates with a 
probability of 9/10, and defects with a probability of 1/10. 

Again, for example, if MAC encounters TQC, the most probable
event matrix, after one hundred moves, is:


Game 7.7 — MAC versus TQC, Event Matrix After 100 Moves


                TQC
            c       d
    C      67      23
MAC
    D       7       3


MAC's expected utilities are:


EU(C) = (67/90)x3 = 2.23
EU(D) = (7x5 + 3)/10 = 3.8 


Thus MAC defects on its one-hundred-and-first move. 

These examples (games 7.4 — 7.7) are presented to illustrate 
the calculi of the maximization family, whose members employ fairly 
sophisticated decision rules. Although, in each of these examples, 
the expected utility of defection on the one-hundred-and-first move 
is greater than that of co-operation, one can see that the distribu- 
tions of joint outcomes differ radically in the four event matrices. 
Since the opposing strategy has been held constant, and its play is 
insensitive to that of its opponents, the differences in distribu- 
tions result solely from the different initial weights accorded to 
the members of the maximization family. 
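
The four event matrices can be approximated by a short calculation. The sketch below assumes that both players move independently during the first hundred moves, and it works with unrounded expected counts, so its figures differ slightly from the whole-number matrices of games 7.4 to 7.7. The names are hypothetical.

```python
def expected_event_matrix(own_coop, opp_coop, moves=100):
    """Expected counts (W, X, Y, Z) when both sides play independently at random."""
    w = moves * own_coop * opp_coop
    x = moves * own_coop * (1 - opp_coop)
    y = moves * (1 - own_coop) * opp_coop
    z = moves * (1 - own_coop) * (1 - opp_coop)
    return w, x, y, z

# Initial co-operative weightings of the four family members.
WEIGHTS = {"MEU": 1 / 2, "MAD": 1 / 10, "MAE": 5 / 7, "MAC": 9 / 10}

for name, weight in WEIGHTS.items():
    w, x, y, z = expected_event_matrix(weight, 3 / 4)
    eu_c = 3 * w / (w + x)
    eu_d = (5 * y + z) / (y + z)
    print(name, [round(v, 1) for v in (w, x, y, z)], round(eu_c, 2), round(eu_d, 2))
```

With unrounded counts every member obtains EU(C) = 2.25 and EU(D) = 4 against this opponent; the small differences among the expected utilities in the examples above arise from rounding the counts to whole numbers, whereas the distributions of joint outcomes themselves differ markedly.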


(IV) The Optimization Family

Unlike the preceding strategic families, members of the
optimization family are related neither by common program structures
nor by variations on a common decision rule. The attribute shared by
this family's members is their demonstrated success in previous
competition(s), achieved by implementing decision rules which attempt
to optimize future outcomes in light of past ones.

(a) NYD: This is Nydegger's strategy. It ranked third in
Axelrod's first tournament, and thirty-first in the second. NYD is 
succinctly described by Axelrod:


"The program begins with tit for tat for the first three
moves, except that if it was the only one to cooperate on
the first move and the only one to defect on the second
move, it defects on the third move. After the third move,
its choice is determined from the 3 preceding outcomes
in the following manner. Let A be the sum formed by
counting the other's defection as 2 points and one's own
as 1 point, and giving weights of 16, 4 and 1 to the
preceding three moves in chronological order. The choice
can be described as defecting only when A equals 1, 6, 7,
17, 22, 23, 26, 29, 30, 31, 33, 38, 39, 45, 49, 54, 55,
58, or 61. Thus if all three preceding moves are mutual
defection, A = 63 and the rule cooperates. This rule was
designed for use in laboratory experiments as a stooge
which had a memory and appeared to be trustworthy,
potentially cooperative, but not gullible."25
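
The weighted sum A in the quoted description can be sketched as follows. The Python is illustrative only: it covers the rule from the fourth move onward, omits the opening three moves, and assumes that the three outcomes are supplied in the order in which the weights 16, 4 and 1 are to be applied, since the quotation leaves that ordering open to interpretation. The function names are hypothetical.

```python
# Values of A for which the rule defects, as listed in the quotation.
DEFECT_SUMS = {1, 6, 7, 17, 22, 23, 26, 29, 30, 31, 33, 38, 39,
               45, 49, 54, 55, 58, 61}

def outcome_points(own, opp):
    """2 points for the other's defection, plus 1 point for one's own."""
    return (2 if opp == "D" else 0) + (1 if own == "D" else 0)

def nydegger_sum(last_three):
    """A for three (own, opp) outcomes; the first element receives weight 16,
    the second weight 4, and the third weight 1 (an assumed ordering)."""
    weights = (16, 4, 1)
    return sum(wt * outcome_points(own, opp)
               for wt, (own, opp) in zip(weights, last_three))

def nydegger_choice(last_three):
    """Defect only when A falls in the listed set; otherwise co-operate."""
    return "D" if nydegger_sum(last_three) in DEFECT_SUMS else "C"

print(nydegger_sum([("D", "D")] * 3))     # 63
print(nydegger_choice([("D", "D")] * 3))  # C
```

Three consecutive mutual defections give A = 63 under either ordering, in agreement with the quotation.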


(b) GRO: This is Grofman's strategy. It ranked fourth in 
Axelrod's first tournament, and twenty-eighth in the second. GRO
co-operates on the first move. After that, GRO co-operates with
probability 2/7 following a dissimilar joint outcome [either (C,d) or
(D,c)], and always co-operates following a similar joint outcome
[either (C,c) or (D,d)].

(c) CHA: This is Champion's strategy. It ranked second in 
Axelrod's second tournament. CHA co-operates on the first ten moves, 
and plays tit-for-tat on the next fifteen moves. From move twenty-six 
onward, CHA co-operates unless all of the following conditions are 
true: the opponent defected on the previous move, the opponent's 
frequency of co-operation is less than 60%, and a random number
between zero and one is greater than the opponent's frequency of co- 
operation. 

(d) ETH: This is Eatherly's strategy. It ranked fourteenth in 
Axelrod's second tournament, but proved quite robust in a tournament 
conducted privately by Eatherly himself.26 As Axelrod observes, ETH
is an elegant rule.27 ETH co-operates on the first move, and keeps a
record of its opponent's moves. If its opponent defects, ETH then 
defects with a probability equal to the relative frequency of the 
opponent's defections. 
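
The decision rules of GRO, CHA and ETH, as described above, can be sketched as functions of the two move histories. The Python is a minimal illustration, not the original submissions: the function signatures are assumptions, and ETH is assumed to co-operate whenever its opponent's last move was co-operative, which the description leaves implicit.

```python
import random

def gro(own, opp, rng):
    """Grofman: co-operate after a similar joint outcome; after a dissimilar
    one, co-operate with probability 2/7."""
    if not own or own[-1] == opp[-1]:
        return "C"
    return "C" if rng.random() < 2 / 7 else "D"

def cha(own, opp, rng):
    """Champion: ten co-operations, fifteen moves of tit-for-tat, then
    defection only when all three stated conditions hold."""
    n = len(opp)
    if n < 10:
        return "C"
    if n < 25:
        return opp[-1]
    coop_rate = opp.count("C") / n
    if opp[-1] == "D" and coop_rate < 0.6 and rng.random() > coop_rate:
        return "D"
    return "C"

def eth(own, opp, rng):
    """Eatherly: after an opponent's defection, defect with probability equal
    to the relative frequency of the opponent's defections so far."""
    if not opp or opp[-1] == "C":
        return "C"
    return "D" if rng.random() < opp.count("D") / len(opp) else "C"

rng = random.Random(0)  # fixed seed, for repeatability only
print(eth(["C"] * 5, ["D"] * 5, rng))  # D, since the defection frequency is 1
```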


25 Ibid., 1980a, p.22.


26 Idem., 1980b, p.392.


27 Ibid.

(V) The Hybrid Family

The members of this family share the common attribute that 
their decision rules, as implied by the family name, are formed by 
the hybridization of other strategic pairs. This family consists of 
one "pure" hybrid (bred from two pure strategies), and one "mixed" 
hybrid (bred from two mixed strategies). 

© (a) FRI: This is Friedman's strategy, which ranked seventh in 
Axelrod's first tournament. FRI co-operates until its opponent 
defects, after which FRI defects for the rest of the game. Hence FRI 
is both nice and provocable, but completely unforgiving. Its proper- 
ties in other contexts are elsewhere discussed.4 In the context of 
the interactive tournament, FRI is interesting because its sequence 
of choices consists either in a string that is identical to CCC, or 
else in a string that is identical to CCC up to some move, and 
identical to DDD thereafter. Thus FRI is a pure strategic hybrid. 

(b) TES: This is a strategy called "Tester", submitted by 
Gladstein. TES finished only forty-sixth in Axelrod's second tourna- 
ment, but proved highly adept at exploiting potentially successful 
strategies,” thus compromising their would-be robustness. TES de- 
fects on the first move. If its opponent ever defects, TES "apolo- 
gizes" by co-operating, and plays tit-for-tat thereafter. Until its 
opponent defects, TES defects with the maximum possible relative 
frequency that is less than 1/2, not counting its first defection. In 
other words, 


"This means that until the other player defects, TESTER 
defects on the first move, the fourth move, and every 
second move after that." 


TES appears somewhat "opportunistic" in character. On the one 
hand, it attempts to exploit co-operative strategies, without being 
excessively provocative. On the other, it attempts to appease provoc- 
able strategies, while retaining its capacity to retaliate. In sum, 
TES incorporates two mixed strategies: defection with relative 
frequency up to one-half, and TFT. Thus TES is a mixed strategic 
hybrid. 


4 E.g. see R. Harris, 'Note on "optimal" policies for the 
Prisoner's Dilemma', Psychological Review, 76, 1969, pp.373-5; & J. 
Friedman, 'A Non-Cooperative Equilibrium for Supergames', Review of 
Economic Studies, 38, 1971, pp.1-12. 


See Axelrod, 1980b, pp.391-3. 


Ibid., p.391. 

This completes a description of the twenty competing strategies 
in the interactive tournament, and their classification by common 
characteristics. It should be stressed that the familial organization 
employed herein is far from unique; any such collection of strategies 
can be grouped in a large number of ways. 

One might group the strategies according to other attributes. 
For instance, CCC and DDD are pure; the others, mixed. But this 
distinction does not yield further information about the mixed group. 

One might choose niceness (the property of never being the 
first to defect) as a criterion of distinction. CCC, TFT, TIT, SHU, 
NYD, GRO, CHA, ETH, and FRI are nice strategies; whereas DDD, TAT, 
and TES can be termed rude strategies (where rudeness is the proper- 
ty of always being the first to defect). This leaves TQD, RAN, TQC, 
BBE, MEU, MAD, MAE, and MAC unqualified, for these strategies are 
neither nice nor rude. They might (after Goodman) be assigned the 
predicate of nide. 

One might classify a strategy in terms of its functional 
calculus: a strategy either employs a probabilistic component at some 
stage, or utilizes a fully deterministic decision rule. TQD, RAN, and 
TQC are wholly probabilistic; TFT, TIT, SHU, TAT, NYD, TES, and FRI 
are wholly deterministic; BBE, GRO, CHA, and ETH are partly probabi- 
listic and partly deterministic; MEU, MAD, MAE, and MAC are sequen- 
tially probabilistic and deterministic; while DDD and CCC are pre- 
determined. But differences between the members of some of these 
groups seem to outweigh their respective common attributes. 

Similarly, any system of classification is bound to admit of 
shortcomings. Given the potential variety of strategies in a collec- 
tion, it seems difficult to develop a uniquely rigorous taxonomy. 
But, given also the infinity of possible strategies in the 
iterated Prisoner's Dilemma,1 and the concomitantly infinite ratio 
of strategies to choices, it does seem reasonable to associate these 
strategies in families which distinguish their overall program 
structures, or conceptual functions. Then, although a given family 
may contain an infinite number of members, one of three cases obtains. 

In the first case, which associates identical program struc- 
tures, the members are parametrically related, which is a very close 
relation indeed. In such families (the random family and the maxi- 
mization family), members differ only by the value of a single 
parameter. 

In the second case, which associates similar conceptual func- 
tions, the members are related either as variations on a common 
decision rule (the tit-for-tat family) or as multiple expressions of 
a common decision principle (the optimization family). 

In the third case, which associates compounded strategies, the 
members are related by their capabilities of entering two (or con- 
ceivably more) distinct decision paths. This is the hybrid family. 

A fourth case also exists, whose members belong to the meta- 
strategic family. In the context of the iterated Prisoner's Dilemma, 
an ideal meta-strategy would attempt to ascertain the identities of 
the other strategies against which it competes, so that it might 
adopt the most effective decision rule against each individual 
opponent. In Axelrod's second tournament, a meta-strategy (that 
tested for both random and highly non-co-operative opponents) ranked 
third overall.2 Iterated meta-strategic considerations, while of 
undoubted complexity and interest, are deemed to lie beyond the scope 
of this enquiry. 

It was decided not to mingle meta-strategies with strategies in 
the interactive tournament, because the development of effective 
strategies is logically prior to that of effective meta-strategies. 
The latter makes use of the former. When more is learned about 
effective strategic choice in the iterated Prisoner’s Dilemma, 
effective meta-strategies can be developed. Meanwhile, one can posit 
the existence of a family of meta-strategies, without actually 
describing any of its members. 


1 Any strategy which employs a probabilistic component in its 
calculus assigns x and (1-x), where 0 < x < 1, as a probability 
distribution over the choices. Since x can take on an infinite number 
of real values, then any such strategy admits of an infinite number 
of possible variations. 


2 Axelrod, 1980b, p.395. 

In sum, while the five strategic families do not constitute a 
rigorous or exhaustive system of classification, they are useful as 
heuristic aids in a controlled experiment. A tournament whose popu- 
lation is of the wild variety has no express need of such groupings; 
as in Axelrod's experiments, the idea is to observe the competition 
of an unregulated population, and to see which strategies are 
successful in a "free-for-all" environment. In a tournament whose 
population is of the domesticated and captive varieties, however, 
these familial groupings allow the observation of the relative 
success of various strategic shadings, whether across the spectrum 
of a single parameter in a common program structure, or in terms of 
conceivable variations on a common functional theme. 

The results of the interactive tournament are presented and 
discussed in several chapters to follow. For rapid reference, a 
glossary of strategic families and acronyms can be found in Appendix 
One (pages 258-260). A table of raw tournament scores is given in 
Appendix Two (page 261). Other pertinent data, to be discussed in 
the next chapter, is tabled in Appendix Three. 

The main tournament, which pits twenty strategies against one 
another (and themselves), consists of one-hundred-and-sixty-five 
(fairly short) computer programs. This number is the difference 
between the two-hundred-and-ten programs theoretically necessary for 
a twenty-by-twenty competition (such that each strategy meets every 
other strategy plus its twin), and the forty-five programs in the 
nine-by-nine sub-competition of nice strategies (again, such that 
each strategy meets every other strategy plus its twin). Since a 
pair of nice strategies commences with and never deviates from 
mutual co-operation, their game scores are predictable. Thus forty- 
five programs did not have to be written (although some were written 
anyway, in order to verify the program logic of certain strategies). 
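
The program count can be checked with a few lines of Python. The helper name pairings_with_twins is an assumption made for illustration; the arithmetic is simply the number of unordered pairs plus one twin pairing per strategy.

```python
from math import comb


def pairings_with_twins(n):
    """Unordered pairs among n strategies, plus each strategy against its twin."""
    return comb(n, 2) + n


all_programs = pairings_with_twins(20)    # twenty-by-twenty competition
nice_programs = pairings_with_twins(9)    # nine-by-nine nice sub-competition
needed = all_programs - nice_programs     # programs that had to be written

print(all_programs, nice_programs, needed)   # 210 45 165
```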
Combinatoric sub-tournaments, ecological scenarios, and maxi- 
mization family analyses consist of fewer programs of greater length 
and complexity. 

All programs are written in GW-BASIC, and documented samples 
are listed in Appendix Four (pages 271-290). 
Chapter Eight 
Analysis of Sub-Tournaments 


Let the results of the main tournament be considered first; 
those of the other sub-tournaments, afterward. (One can draw an 
analogy with set theory, and consider the main tournament as a proper 
sub-tournament of itself.) The overall results are tabled as follows: 


Table 8.1 — Main Tournament, Ranks and Scores 


Rank        Strategy   Points    Average   Points    Average   Rank 
(Offence)              Scored    Score     Allowed   Allowed   (Defence) 
 1          MAC        52901     2645      32054     1603       6 
 2          MAE        50058     2503      26891     1345       4 
 3          SHU        49844     2492      39699     1985       9 
 4          FRI        48823     2441      35403     1770       7 
 5          CHA        48719     2436      55874     2794      16 
 6          ETH        48484     2424      55270     2764      14 
 7          MEU        47235     2362      22607     1130       3 
 8          TFT        47210     2361      47240     2362      11 
 9          TES        46804     2340      41789     2089      10 
10          TIT        45927     2296      54057     2703      13 
11          BBE        42688     2134      37343     1867       8 
12          GRO        42424     2121      60594     3030      17 
13          TQD        41787     2089      31267     1563       5 
14          MAD        41717     2086      15274      764       2 
15          DDD        40024     2001      14994      750       1 
16          RAN        40007     2000      47291     2365      12 
17          TAT        38676     1934      55636     2782      15 
18          NYD        37803     1890      72783     3639      19 
19          TQC        37047     1852      62922     3146      18 
20          CCC        36486     1824      75676     3784      20 


The winner of the main tournament, by a comfortable margin, is 
MAC. MAC is the most co-operatively weighted member of the maximiza- 
tion family. Second place, by a narrower margin, goes to MAC's 
closest relative, MAE. Third place is taken by the least-forgiving 
member of the tit-for-tat family, SHU. Fourth place belongs to the 
pure hybrid, FRI. Fifth and sixth places are occupied by members of 
the optimization family, CHA and ETH. 

The upper twelve places are taken by members of four of the 
five competing families. No member of the probabilistic family 
finished higher than thirteenth place; and two of its members, TQC 
and CCC, finished nineteenth and twentieth respectively. 

The average scores (per game) are distributed within the 
following limits. The length of a game is 1000 moves. Hence, the 
maximum achievable score in any game is 5000 points; the minimum, 
zero points. These extrema occur if one strategy defects 1000 times, 
while its opponent co-operates 1000 times. This extreme situation 
actually obtained in two cases: DDD and TAT both scored maximum 
points against CCC, which went scoreless against both. While these 
two dismal outings by CCC contributed to its last-place finish, 
neither of the two strategies that exploited CCC to the limit fared 
much better than their victim overall. 
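
These limits follow directly from the per-move payoffs. The Python sketch below assumes the customary payoffs of 5, 3, 1 and 0 points, which are consistent with the limits quoted in this chapter; the dictionary layout and the function name are assumptions made for illustration.

```python
# Per-move payoffs (row player, column player); assumed customary values.
PAYOFF = {
    ("C", "C"): (3, 3),   # mutual co-operation
    ("C", "D"): (0, 5),   # co-operator exploited by defector
    ("D", "C"): (5, 0),   # defector exploits co-operator
    ("D", "D"): (1, 1),   # mutual defection
}
MOVES = 1000              # length of a tournament game


def constant_game(a, b, moves=MOVES):
    """Game scores when the two players repeat the same choices every move."""
    pa, pb = PAYOFF[(a, b)]
    return pa * moves, pb * moves


print(constant_game("D", "C"))   # (5000, 0): e.g. DDD versus CCC
print(constant_game("C", "C"))   # (3000, 3000): constant mutual co-operation
```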

A useful bench mark is the 3000 point level, attained by both 
members of any strategic pair that practices mutual co-operation for 
an entire game. This occurred on eighty-one occasions, in all pos- 
sible encounters between nice strategies (CCC, TFT, TIT, SHU, NYD, 
GRO, CHA, ETH, and FRI). Owing to the mixture of nice, rude, and nide 
strategies in the population, no strategy—nice or otherwise—was 
able to maintain an average score of 3000 points. MAC and MAE, which 
fared best with respective averages of 2645 and 2503 points per game, 
are neither nice nor rude, but nide. SHU, the best of the nice 
strategies, managed an average of 2492. 

The tournament is, of course, zero-sum with respect to total 
points scored and total points allowed. The former is the sum of the 
sums of the rows of the raw score matrix (see Appendix Two); the 
latter, the sum of the sums of the columns. Although total points 
scored equal total points allowed (for all strategies combined), the 
distributions of points scored and points allowed are obviously quite 
different. Although no strategy averaged more than 2645 points scored 
per game, several strategies allowed, on average, in excess of three 
thousand points per game to be scored against them. 

With respect to points allowed, CCC, TQC, NYD and GRO surpassed 
the 3000 point bench mark. On this side of the ledger, the accomp- 
lishment is of dubious merit. It indicates that these four strategies 
are the most exploitable. When the average score allowed by a strat- 
egy exceeds 3000 points per game, which is the level of constant 
mutual co-operation, then that strategy is being exploited by op- 
ponents which regularly defect while the strategy itself continues to 
co-operate. An exploitable strategy lacks the quality of provocabil- 
ity. In game-theoretic terms, an unprovocable strategy invites and 
encourages exploitation from all exploitive strategies in its en- 
vironment. 

Overall, in table 8.1, there does not appear to be a strong 
correlation between points scored and points allowed. The three 
strategies that allowed the fewest points (DDD, MAD and MEU respec- 
tively) ranked fifteenth, fourteenth and seventh (respectively) in 
points scored. Thus, extreme stinginess on defence did not conduce to 
copious success on offence. As well, one notes that the three most 
exploitable strategies (CCC, NYD, and TQC respectively) also fared 
worst in points scored, though not in that order. Then again, the 
fourth highly exploitable strategy, GRO, scored enough points to 
finish twelfth. Thus, extreme generosity on defence did not neces- 
sarily conduce to copious failure on offence. 

Correlations (and lack thereof) can be better observed in table 
8.2, in which strategies are ranked not only according to their 
relative offensive and defensive performances, but also according to 
their relative differences between points scored and points allowed. 

Table 8.2 illustrates that the relative differences between 
average points scored and allowed correlate fairly strongly with 
relative average points allowed.1 But poor overall correlation 
obtains between relative average points scored and relative average 
points allowed.2 The upper four strategies all had more points scored 
than allowed; the lower five, fewer. But from ranks five through 
fifteen inclusive, there appears to be no correlation between offen- 
sive and defensive performance. 

Indeed, four of the top ten strategies (CHA, ETH, TFT and TIT) 
were out-scored, on average, by their opponents. But crude averages 
can be misleading. The relative success of these strategies lies in 
the precise distribution and magnitudes of their individual scores. 


1 A linear regression program (least-squares method) computes 
their co-efficient of determination ("goodness of fit") as 0.952. 


2 Their co-efficient of determination is similarly computed as 
0.187. 


Table 8.2 — Offensive, Defensive, and Differential Rankings 


Strategy   Average   Average   Difference    Rank        Rank        Rank 
           Score     Allowed   A.S. - A.A.   (Offence)   (Defence)   A.S. - A.A. 
MAC        2645      1603       1042          1           6           5 
MAE        2503      1345       1158          2           4           4 
SHU        2492      1985        507          3           9           7 
FRI        2441      1770        471          4           7           8 
CHA        2436      2794       -358          5          16          13 
ETH        2424      2764       -340          6          14          12 
MEU        2362      1130       1232          7           3           3 
TFT        2361      2362         -1          8          11          11 
TES        2340      2089        251          9          10          10 
TIT        2296      2703       -407         10          13          15 
BBE        2134      1867        267         11           8           9 
GRO        2121      3030       -909         12          17          17 
TQD        2089      1563        526         13           5           6 
MAD        2086       764       1322         14           2           1 
DDD        2001       750       1251         15           1           2 
RAN        2000      2365       -365         16          12          14 
TAT        1934      2782       -848         17          15          16 
NYD        1890      3639      -1749         18          19          19 
TQC        1852      3146      -1294         19          18          18 
CCC        1824      3784      -1960         20          20          20 
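

The contrast between the two correlations can be reproduced approximately from the averages in table 8.2. The Python sketch below assumes that the regressions are run on the tabled averages themselves (the text does not state whether values or ranks were regressed), and recomputes each difference from its two averages; its results therefore need not coincide exactly with the footnoted co-efficients. The list and function names are assumptions made for illustration.

```python
from statistics import mean

# Average score and average allowed, in the row order of table 8.2.
SCORED = [2645, 2503, 2492, 2441, 2436, 2424, 2362, 2361, 2340, 2296,
          2134, 2121, 2089, 2086, 2001, 2000, 1934, 1890, 1852, 1824]
ALLOWED = [1603, 1345, 1985, 1770, 2794, 2764, 1130, 2362, 2089, 2703,
           1867, 3030, 1563, 764, 750, 2365, 2782, 3639, 3146, 3784]


def r_squared(xs, ys):
    """Co-efficient of determination of a simple least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Differences are recomputed here (an assumption), not copied from the table.
difference = [s - a for s, a in zip(SCORED, ALLOWED)]

print(round(r_squared(ALLOWED, difference), 3))   # strong fit
print(round(r_squared(ALLOWED, SCORED), 3))       # weak fit
```

The first figure is high because the spread of points allowed is far wider than the spread of points scored, so that the differences are dominated by the defensive averages.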


For example, compare the records of FRI and CHA, which ranked 
fourth and fifth respectively. Offensively, FRI out-pointed CHA by a 
mere 5 points per game, on average; while defensively, FRI allowed an 
average of 1024 fewer points per game. Of its twenty tournament 
games, FRI won nine, lost one, and drew ten. CHA, on the other hand, 
won none of its games, lost eleven, and drew nine. Had these tourna- 
ment game results been applied to a meta-tournament, with meta- 
payoffs (assessed by comparing game scores) of two points for a win, 
one point for a draw, and zero points for a loss, then FRI's meta- 
tournament score would be 28 points; CHA's, 9 points. These two meta- 
scores are clearly not in the same proximity as their strategies’ 
offensive ranks. 

After this procedure is applied to all tournament games, the 
strategies can be ranked according to their meta-tournament points: 
Table 8.3 — Meta-Tournament Rankings 


Rank   Strategy   Games   Wins   Losses   Draws   Points 
 1     DDD        20      19      0        1      39 
 2     MAD        20      17      2        1      35 
 3     MEU        20      15      3        2      32 
 3     BBE        20      15      3        2      32 
 5     FRI        20       9      1       10      28 
 6     MAE        20      13      6        1      27 
 7     MAC        20      11      8        1      23 
 8     TQD        20      10      8        2      22 
 8     SHU        20       6      4       10      22 
10     RAN        20       9      9        2      20 
11     TAT        20       8     10        2      18 
12     TES        20       5      8        7      17 
13     TQC        20       6     13        1      13 
13     TFT        20       0      7       13      13 
15     GRO        20       1      9       10      12 
16     NYD        20       0     10       10      10 
16     ETH        20       0     10       10      10 
18     CHA        20       0     11        9       9 
18     TIT        20       0     11        9       9 
18     CCC        20       0     11        9       9 


Total participations/wins/losses/draws: 400/144/144/112 
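

The meta-payoff rule is simple enough to restate as a short Python sketch. The function name and the handful of records copied from table 8.3 are chosen for illustration only.

```python
def meta_points(wins, losses, draws):
    """Two meta-points per win, one per draw, none per loss."""
    return 2 * wins + 1 * draws + 0 * losses


# (wins, losses, draws) as tabled for a few entries of table 8.3.
RECORDS = {
    "DDD": (19, 0, 1),
    "FRI": (9, 1, 10),
    "SHU": (6, 4, 10),
    "TFT": (0, 7, 13),
    "CHA": (0, 11, 9),
}

for name, record in RECORDS.items():
    print(name, meta_points(*record))

# Tabled totals over all twenty strategies: 144 wins, 144 losses, 112 draws.
# Each participation earns one meta-point on average, so the grand total
# equals the 400 participations.
grand_total = meta_points(144, 144, 112)
print(grand_total)   # 400
```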


Table 8.3 reveals a different dimension of strategic interac- 
tion. With respect to the prior discussion, one finds indeed that FRI 
and CHA, which ranked fourth and fifth in the tournament, now rank 
fifth and eighteenth in the meta-tournament. DDD, which ranked 
fifteenth, now ranks first. In fact, the upper three places in the 
meta-tournament are occupied by the strategies with the three best 
defensive records in the tournament, in that order. These strategies 
previously ranked fifteenth, fourteenth and seventh. 

The maximization family still fares relatively well, occupying 
second, third, sixth and seventh places, but their members' order of 
placement is now reversed. Other curious results appear. For in- 
stance, one sees that SHU, which ranked third in the tournament, won 
only six of its twenty tournament games. And TFT, which placed eighth 
in the tournament, thus fared better than twelve other strategies 
without winning even a single tournament game. (It achieved the 
highest number of draws, thirteen, which is attributable to its 
mirror-image play.) TFT's untrustworthy relative, BBE, jumped from 
eleventh place in the tournament to a tie for third in the meta- 
tournament. The only strategy to maintain the same rank in both the 
tournament and the meta-tournament is CCC, which finished last with 
unenviable consistency. 

One may well ask: how significant are these meta-tournament 
results? In what way, if any, can they be interpreted relative to a 
given strategy's success, or lack thereof, in the normal tournament? 

From one perspective, it might appear that rankings based upon 
meta-tournament points provide a more accurate reflection of overall 
performance than rankings based solely upon offensive tournament 
points. In game-theoretic terms, the points scored in a tournament 
game are the number of utiles accrued by the given strategy, or, 
equivalently, the net utility to the player who employs the given 
strategy. By the same token, the points allowed in a tournament game 
are the number of utiles accrued by the opposing strategy. One must 
now enquire whether a symmetric equivalence obtains. Can the points 
allowed by a given strategy be similarly regarded, as the net dis- 
utility to the player who employs the given strategy? In other words, 
are defensive considerations of any importance? 

In the case of a lop-sided game score, such as 5000 to 0 for 
DDD versus CCC, one might be inclined to regard CCC's points allowed 
as a definite dis-utility to the purely co-operative player. But in 
the case of a mutually co-operative game score, such as 3000 to 3000 
for TFT versus CCC, one is inclined to regard the outcome as mutually 
beneficial. Since 3000 points is far more than any strategy averaged 
throughout the tournament, and is consistently achievable only by 
constant mutual co-operation, it would seem contradictory for player 
A to celebrate the utility of 3000 points scored against player B, 
while bemoaning the dis-utility of 3000 points allowed to player B. 

Nevertheless, the argument against regarding player B's share 
of a mutually high score as a dis-utility to player A may be inadmis- 
sible in the context of the interactive tournament. By definition, 
this game-theoretic tournament is concerned with strategic interplay, 
not with hypothetical players’ motives. One seeks a fair way in which 
to assess the overall performance of the strategies, without presum- 
ing upon the psychology of the players who adopt them. 

A ranking scheme based solely upon points scored does seem 
incomplete, for it disregards the significance of points allowed. On 
the other hand, a ranking scheme based upon meta-tournament points is 
clearly inappropriate, for the simple reason that strategies are both 
defined and designed to compete in a tournament, not a meta-tourna- 
ment. 

It is possible to reconcile the apparent incompleteness of a 
purely offensive ranking scheme for the tournament, as well as the 
interesting result but questionable applicability of a meta-tourna- 
ment ranking scheme, by translating both the tournament and the meta- 
tournament into an allegorical social context. 

Suppose twenty players are to compete in an apple-picking 
contest. The contest format is as follows. All possible player-pairs 
(including "clones") are to be formed. One pair at a time is sent 
into the orchard, each player in the pair carrying an identical empty 
basket (whose capacity is five thousand apples). Each pair is allowed 
an identical time-period during which its players may accumulate 
apples in their respective baskets. At the expiration of its time- 
period, the pair exits the orchard, the players empty their baskets, 
and their respective numbers of apples are counted and recorded. The 
next pair is then sent into the orchard. It is understood that, after 
all possible pairs have competed, the player who accumulates the 
greatest number of apples wins the contest. 

The players in a given pair are not prohibited from interfering 
with each other's picking. A player may adopt one of a range of 
strategies, from attempting to maximize his own pickings while 
ignoring the other player, to attempting to minimize the other 
player's pickings while possibly diminishing his own. A "nice" player 
is never the first to interfere with the other player's picking; a 
"rude" player, always the first to do so. A "provocable" player is 
one who responds to interference with interference. A "forgiving" 
player is one who, after having been provoked to interference, also 
desists from interfering after the other player desists. An "exploit- 
able" player is not provocable. An "exploitive" player interferes 
with an exploitable one. 

Two nice players, when paired, are able to pick about 3000 
apples each during the allotted time. An exploitive player, when 
paired with an exploitable one, is able to pilfer the exploitable 
player's pickings, and thus fills his basket by partly emptying the 
other's. At the extreme of this situation, the highly exploitive 
player emerges from the orchard with 5000 apples; the highly ex- 
ploitable player, with none. By contrast, when two highly exploitive 
players are paired, their mutual interference limits each player's 
pickings to about 1000 apples. 

Several other types of strategies, which reflect different 
mixtures of attributes, are adopted by other players in the competing 
population. It is understood that every player chooses his strategy 
prior to the commencement of play, and that no player alters his 
strategy during the course of the contest. 

Since the winner of this apple-picking contest is, by defini- 
tion, the player who accumulates the most apples, then the most 
successful strategy in the contest is, ceteris paribus, that strategy 
adopted by the winning player. 

Now let a second apple-picking contest be conducted, which is 
identical to the first in all aspects of play, but whose manner of 
determining the winner differs. After each pair of players emerges 
from the orchard, their baskets are emptied and their respective 
apples are counted, as before. But in this contest, the precise 
apple-count is not recorded. Instead, for every pair, the player with 
the greater apple-count receives two oranges; the player with the 
lesser apple-count, no oranges. If both players in the pair have the 
same apple-count, they each receive one orange. The winner of this 
contest is the player who accumulates the greatest number of oranges. 

Now, to differentiate between the results of the tournament 
ranking scheme (table 8.1) and the meta-tournament ranking scheme 
(table 8.3), one need only ask the allegorical question: are the 
strategies competing for apples, or for oranges? In the first apple- 
picking contest, the most successful strategy is that which yields 
the greatest accumulation of apples to the player who adopts it. In 
the second apple-picking contest, the most successful strategy is 
that which yields the greatest accumulation of oranges to the player 
who adopts it, by means of yielding the smallest accumulations of 
apples to the players who compete against it. 

The first contest is won by a strategy that seeks to maximize 
its expected gains, without necessarily minimizing the expected gains 
of its competitors. In other words, the first apple-picking contest 
is won by the player who picks the most apples. This seems eminently 
reasonable. The second contest is won by a strategy that seeks only 
to minimize the expected gains of its competitors. In other words, 
the second apple-picking contest is won by the player against whom 
other players pick the fewest apples. But on the whole, the winner of 
the second contest accumulates fewer apples than the majority of the 
other players. Thus, although this winner accumulates the most 
oranges, he fares relatively poorly at accumulating apples. It does 
not seem reasonable that an apple-picking contest be won by a player 
who is a poorer picker than a majority of the other competitors. 

The first contest is an obvious allegory of the interactive 
tournament; the second, of the associated meta-tournament. The 
allegorical social context vindicates the tournament's offensive 
ranking scheme, and illustrates the inappropriateness of the meta- 
tournament ranking scheme. 

There is also a compelling game-theoretic reason why it must do 
so. By definition, the Prisoner's Dilemma is a non-zero-sum game. 
Both the interactive tournament, and the first apple-picking contest, 
are Prisoner's Dilemmas. But the meta-tournament and the second 
apple-picking contest are both constant-sum games, whose constant 
sums are two points, and two oranges, respectively. Hence, contrary 
to appearances, neither example is a Prisoner's Dilemma, and the 
strategic considerations applicable to Prisoner's Dilemmas do not 
carry over to these examples. 

Why, then, were these examples presented? Because they deli- 
neate a crucial strategic development in the conflicts thus far 
examined; namely, the potential failure of the dominance strategy in 
larger populations of strategies. 
The dominance strategy, DDD, is simply the dominance principle 
iterated over the entire course of a game. Part Two of this enquiry 
was devoted to an exposition of the unresolved conflict between 
dominance and maximization of expected utility, observed in the 
static, two-person Prisoner's Dilemma. Early in Part Three, it was 
noted (both theoretically and empirically) that the conflict persists 
in iterated two-person games, with a long-term tendency toward one of 
two joint similar outcomes, either (C,c) or (D,d). But now one has an 
indication that dominance reasoning breaks down in the N-pair, two- 
person Prisoner's Dilemma. 

Reconsider the first apple-picking contest, with a competing 
population of only two players. The rules remain the same, with one 
amendment: if both players accumulate 3000 apples, they both win; if 
both accumulate 1000 apples, they both lose. If player X adopts the 
apple-picking strategic equivalent of DDD, then player Y cannot 
possibly pick more apples than X, no matter what strategy Y adopts. 
If both players reason similarly, both will adopt DDD, and both will 
lose. 

But if a third player, Z, enters the competition, the strategic 
balance of power shifts away from DDD. Two nice players will both 
fare better, overall, than a single rude exploitive player, providing 
that the nice players are both provocable. No player, of course, can 
predict what strategies the other two will adopt. Let player X 
contemplate adopting a rude and exploitive strategy. Then X knows 
that if both Y and Z are rude and exploitive, all will lose; if Y is 
rude and exploitive while Z is nice, Z will lose; if both Y and Z are 
nice and provocable, X will lose; if both Y and Z are nide, X will 
lose. 
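
The three-player case can be made concrete with a Python sketch, under stated assumptions: X plays DDD; Y and Z both play TFT (standing in here for "nice and provocable"); each pair meets once for 1000 moves; twin games are omitted; and the payoffs are the customary 5, 3, 1 and 0. None of the names below comes from the tournament programs.

```python
from itertools import combinations

# Per-move payoffs (first player, second player); assumed customary values.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def ddd(own_history, opp_history):
    """Pure defection."""
    return "D"


def tft(own_history, opp_history):
    """Tit-for-tat: co-operate first, then echo the opponent's last move."""
    return opp_history[-1] if opp_history else "C"


def play(s1, s2, moves=1000):
    """Play one iterated game and return both players' total points."""
    h1, h2, p1, p2 = [], [], 0, 0
    for _ in range(moves):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        a, b = PAYOFF[(m1, m2)]
        p1, p2 = p1 + a, p2 + b
        h1.append(m1)
        h2.append(m2)
    return p1, p2


players = {"X": ddd, "Y": tft, "Z": tft}
totals = dict.fromkeys(players, 0)
for a, b in combinations(players, 2):      # X-Y, X-Z, Y-Z
    pa, pb = play(players[a], players[b])
    totals[a] += pa
    totals[b] += pb

print(totals)   # {'X': 2008, 'Y': 3999, 'Z': 3999}
```

Under these assumptions X out-scores each opponent head-to-head (1004 to 999), yet finishes the contest with 2008 apples to their 3999 apiece.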

In general, a rude and exploitive player's a priori chances of 
winning diminish as the strategic population grows and varies. The 
pure dominance strategy cannot guarantee that a competitor will not 
win, because the strategy cannot dominate all the competitor's 
interactions. A strategy that wins oranges at apple-picking contests 
wins a chimerical victory. Thus player X would be well-advised to 
contemplate the adoption of a strategy that is not rude and exploi- 
tive. What strategy should he adopt? 
Leaving the apple-orchard, and returning to the interactive 
tournament, the maximization family members MAC, MAE and MEU all 
fared better than DDD. MAC and MAE, in fact, fared better than all 
other strategies. This result marks a turning point in the strategic 
conflict: in the interactive tournament involving twenty strategies, 
pure defection is relegated to a position of relative obscurity, 
whereas co-operative members of the maximization family appear to be 
ascendant. 

But, as noted at the beginning of this chapter, the results in 
table 8.1 represent only one element of a large set of possible sub- 
tournaments. Thus far in the chapter, two principal results are 
established: first, that a ranking scheme based upon (offensive) 
points scored is appropriate for this type of tournament; second, 
that, given said ranking scheme, MAC is the most successful strategy in the 
main tournament involving twenty strategies. Next, one must ask: how 
robust is MAC in the interactive environment? This question can be 
answered by examining the results of all possible sub-tournaments of 
the main tournament. 

Briefly, this approach can be contrasted with Axelrod's. In 
both of Axelrod's tournaments—as in this interactive tournament—the 
criterion of a strategy's success is the number of points it scores. 
Axelrod does not discuss the relative merits of different ranking 
schemes. He simply assumes that strategies should be ranked in 
descending order of total points scored, and his discussions of 
strategic success and robustness are predicated upon that assumption. 
This chapter's comparison of ranking schemes supports Axelrod's 
assumption; moreover, such support emanates from a game-theoretic 
perspective. Hence Axelrod's tournaments and the interactive tourna- 
ment employ the same method for evaluating a given strategy's suc- 
cess. But, with respect to robustness, the methodologies differ. 

TFT won Axelrod's first tournament; however, recall that
Axelrod describes three other strategies which would have won if
submitted.³ Thus TFT may not have been the most robust strategy in
that environment. It certainly had potential rivals. Axelrod's second 


3 Axelrod, 1980a, p.20. 
tournament, also won by TFT, had a much larger competing population,
and here Axelrod uses a rather elegant method to evaluate TFT's
robustness. Owing to the unwieldiness of the second tournament's 
matrix of raw scores, which contains 63x63 or 3969 entries, Axelrod 
uses step-wise regression to express the overall performance of any 
strategy in terms of its performance against just five other strate- 
gies, which he calls "representatives": 


"These five rules [i.e. decision rules, or strategies] 
can be thought of as representatives of the full set in 
the sense that the scores a given rule gets with them can 
be used to predict the average score the rule gets over 
the full set." 


Axelrod is able to use these representatives to assess TFT's 
robustness, in the following way. Each of these five strategies can 
be thought of as representing a "constituency" of strategies. A sixth 
constituency is formed by the unrepresented "residuals". Axelrod then 
conducts six hypothetical tournaments, in each of which one of the 
six constituencies, in turn, is enlarged to five times its original 
size by weighting its representative accordingly.⁴ Thus TFT must now
compete in populations formed by distending six different segments of 
the original strategic distribution. TFT won five of these six 
hypothetical tournaments. Based on this result, Axelrod pronounces 
TFT robust.⁵

This enquiry adopts a different methodology, namely that of 
combinatoric analysis. Given a set of n elements, one can combine r
elements from that set in n!/r!(n-r)! different ways. This operation
is commonly referred to as "n choose r", or C(n,r). In the interac-
tive tournament, the number of strategies (or elements) is twenty. 
The twenty strategies can be combined in just one way, since C(20,20) 
= 20!/20!0! = 1. (By definition, the factorial of zero is unity.) The 
results of this single sub-tournament, for r = 20, appear in table
8.1. But r can assume a range of theoretical values, from


4 Idem., 1980b, p.386. The co-efficient of correlation between
the scores predicted by the step-wise regression and the actual
tournament scores is a respectable .979.


5 Ibid., pp.396-398.
1 ≤ r ≤ n. In practice, at least two strategies are required for a
competition to take place, so the value r = 1 is not applicable here.

Note also that, if a + b = n, then C(n,a) = C(n,b). For example,
the number of sub-tournaments that can be conducted with
different combinations of eighteen strategies is the same as the 
number that can be conducted with different combinations of two 
strategies (since 2 + 18 = 20). This number is 190. But if all 190 
combinations of eighteen strategies are formed, each individual 
strategy appears in 171 of these combinations; whereas if all 190 
combinations of two strategies are formed, each individual strategy 
appears in only 19 of these combinations. In general, for C(n,r), 
each individual element appears in 


(r/n) x C(n,r), or (n-1)!/(r-1)!(n-r)!


different combinations. 

For all applicable values of r, the numbers of possible
sub-tournaments and the numbers of appearances of each strategy are
given in table 8.4, overleaf.

The total number of possible sub-tournaments that can be
conducted, from all combinations of strategies for each applicable
value of r, is 1,048,555 (that is, 2^20 less the one empty and the
twenty single-strategy combinations). The total number of
sub-tournaments in which each strategy competes is 524,287 (that is,
2^19 less one). Thus each strategy competes in more than half a
million different sub-tournaments, against all possible combinations
of the other strategies in the population.
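
The counts just described can be checked with a short sketch. It assumes only the combinatoric formulae given above; the function and variable names are illustrative and form no part of the tournament program itself. Note that table 8.4 places paired group sizes (for example, 18 or 2) on a single row, so each paired row must be counted once for each of its two group sizes when the sub-tournaments are totalled.

```python
from math import comb

N_STRATEGIES = 20  # size of the interactive tournament's population


def sub_tournament_counts(n: int = N_STRATEGIES) -> dict:
    """Map each group size r (2..n) to a pair:
    (number of sub-tournaments, appearances of any one strategy)."""
    return {r: (comb(n, r), comb(n - 1, r - 1)) for r in range(2, n + 1)}


counts = sub_tournament_counts()

# Totals over all applicable group sizes, r = 2, ..., 20.
total_sub_tournaments = sum(total for total, _ in counts.values())
appearances_per_strategy = sum(seen for _, seen in counts.values())

for r in (20, 19, 18, 2):
    print(r, counts[r])
print(total_sub_tournaments, appearances_per_strategy)
```

The appearance count uses the identity (r/n) x C(n,r) = C(n-1,r-1), which avoids fractional arithmetic.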

In order to evaluate the results of this large number of sub- 
tournaments, the following procedure is adopted. All sub-tournament 
combinations involving r strategies are conducted, one at a time, for 
each value of r. 
Table 8.4 — Sub-Tournaments Resulting From
Combinations of r Strategies


Value of r         Combinatoric formula    Number of different   Number of sub-tournaments
(number of         C(n,r) = n!/r!(n-r)!    sub-tournaments       in which each strategy
strategies                                                       appears, (r/n) x C(n,r)
competing in
sub-tournament)

20                 20!/20!0!                        1                       1
19                 20!/19!1!                       20                      19
18 or 2            20!/18!2!                      190               171 or 19
17 or 3            20!/17!3!                     1140              969 or 171
16 or 4            20!/16!4!                     4845             3876 or 969
15 or 5            20!/15!5!                    15504           11628 or 3876
14 or 6            20!/14!6!                    38760          27132 or 11628
13 or 7            20!/13!7!                    77520          50388 or 27132
12 or 8            20!/12!8!                   125970          75582 or 50388
11 or 9            20!/11!9!                   167960          92378 or 75582
10                 20!/10!10!                  184756                   92378


Let r have a given value. Suppose strategy S_1 ranks first in
the first sub-tournament conducted (for that r). Then strategy S_1
fared better than (r-1) other strategies in that particular combina-
tion. Hence, strategy S_1 is awarded (r-1) points. Similarly, if
strategy S_2 ranks second, then strategy S_2 fared better than (r-2)
other strategies in that particular combination. Hence, strategy S_2
is awarded (r-2) points. This procedure is applied to all strategies
in that sub-tournament combination. In other words, each strategy in
that particular combination is awarded a number of points, equal to
the number of strategies it betters. Suppose strategy S_r ranks last.
Since S_r betters no strategies, it is awarded no points.

The second sub-tournament combination involving r strategies
(for the same value of r) is then tried. Once again, points are
awarded to each strategy appearing in this combination, according to
the number of other strategies it betters, from (r-1) points for the
first-ranking strategy to zero points for the last-ranking strategy.

When a given sub-tournament combination consists of nice 
strategies only, they all achieve identical scores. In such cases, 
when r nice strategies draw, they each receive (r-1) points. And most
generally, if any sub-tournament involving r strategies sees p of
these strategies tied for q-th place, then each of the p strategies
receives (r-q) points.
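
The point-award rule, including the treatment of ties, can be sketched as follows. This is a minimal illustration and not the program used in the enquiry: the function name and the raw scores are hypothetical, and it is assumed that a strategy's place q is one more than the number of competitors scoring strictly higher, so that strategies tied for q-th place each receive (r-q) points.

```python
def award_points(scores: dict) -> dict:
    """Award each strategy (r - q) points, where r is the number of
    competitors and q is the strategy's place in the sub-tournament.
    Tied strategies share the same (best available) place."""
    r = len(scores)
    points = {}
    for name, score in scores.items():
        # Place q = 1 + number of competitors with a strictly higher score.
        q = 1 + sum(1 for other in scores.values() if other > score)
        points[name] = r - q
    return points


# Hypothetical raw scores for a sub-tournament of four strategies,
# two of which tie for second place.
example = {"S_1": 310, "S_2": 295, "S_3": 295, "S_4": 120}
print(award_points(example))

# Hypothetical all-nice sub-tournament: every strategy scores alike,
# so all tie for first place and each receives (r - 1) points.
all_nice = {"N_1": 300, "N_2": 300, "N_3": 300}
print(award_points(all_nice))
```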

After C(20,r) different combinations are exhausted for the
given r, each strategy will have appeared in 19!/(r-1)!(20-r)!
different sub-tournaments. In order to determine which strategy is
most successful for this value of r, the efficiency of each stra-
tegy's performance is calculated according to the following formula.
If a strategy wins each and every sub-tournament for this value of r,
its point-awards would total


(r-1) x 19!/(r-1)!(20-r)!
or 19!/(r-2)!(20-r)!


This is the maximum number of points awardable to a strategy, for any 
given value of r. The relative efficiency of a strategy, then, is 
simply its actual point-award total divided by this maximum number. 
(The relative efficiency is then multiplied by one hundred for 
expression as an efficiency percentage.) 

A specific example of the entire procedure is tabled overleaf 
(table 8.5), for the twenty different sub-tournaments conducted by 
forming all possible combinations of nineteen strategies. 

For r = 19, there are twenty possible sub-tournaments. Each 
strategy appears in nineteen sub-tournaments, and can be awarded a 
maximum of 18 points in each appearance. Hence, the ideal point-award 
total is 19x18 = 342 total points. 

Since MAC ranked first in all its appearances, it actually 
achieved this ideal; hence, its efficiency is (342/342)x100, or 100%, 
in sub-tournaments involving nineteen strategies. 

MAE ranked second in twelve sub-tournaments; third, in three 
sub-tournaments; fourth, in two sub-tournaments; sixth, in two sub- 
tournaments. Hence, MAE bettered seventeen opponents on twelve 
occasions; sixteen, on three occasions; fifteen, on two occasions; 
and thirteen, on two occasions. This tally accounts for MAE's nine-
teen appearances. MAE's relative efficiency is therefore

[(12x17) + (3x16) + (2x15) + (2x13) ]/342 
= 308/342 = .901 

Thus MAE is 90.1% efficient in sub-tournaments involving nineteen
strategies. 
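
The efficiency calculation can be sketched in a few lines, using the rank tallies for MAC, MAE and CCC quoted in the text for r = 19. The function name and the dictionary layout are illustrative assumptions, not the enquiry's own implementation.

```python
from math import comb


def efficiency(rank_counts: dict, r: int, n: int = 20) -> float:
    """Efficiency percentage of one strategy at group size r.

    rank_counts maps a place q to the number of sub-tournaments in
    which the strategy finished q-th."""
    appearances = comb(n - 1, r - 1)
    if sum(rank_counts.values()) != appearances:
        raise ValueError("rank tallies must account for every appearance")
    max_points = (r - 1) * appearances          # first place every time
    actual = sum((r - q) * count for q, count in rank_counts.items())
    return 100 * actual / max_points


# Rank tallies for r = 19, as stated in the text.
mac_r19 = {1: 19}
mae_r19 = {2: 12, 3: 3, 4: 2, 6: 2}
ccc_r19 = {17: 1, 18: 1, 19: 17}

for name, tally in (("MAC", mac_r19), ("MAE", mae_r19), ("CCC", ccc_r19)):
    print(name, round(efficiency(tally, 19), 1))
```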


Table 8.5 — 19 Appearances in 20 Sub-Tournaments 
Involving 19 Strategies


Rank: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ER
MAC 19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 100
MAE 0 12 3 2 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 90.1
År 6 9 1 2 0 0 0 00 0 00 0 0 00 0 O 898 
FRIO 0 6 8 1 3 0 0 1 0 0 00 0 0 0 0 0 O 616 
CA 0 2 06 6 4 1 0 00 0 00 0 0 80 0 0 0 75 
Ho 0 1 1 8 4 4 12 0 0 0 00 0 0 0 0 0 09 74.3 
ARVO 0 1 2 1 5 2 3 1 4 0 0 0 0 0 0 0 0 0 66.7 
IT o 0 0 0 1 2 1 5 1:0 0 00 0 0 00 0 0 6.8 
TES 0 000 140,3 8 70 0 00 0 0 00 0 O 60.8 
To 000 0 00 3 10 6 0 00 0 0 0 0 0 0 547 
BE 0 000 0 0 0 0 04 8 52 0 0 00 0 0 430 
60 0 000 0 0 0 0 03 5 55 1 0 00 0 0 40 
MDO 000 0 0 0 0 903 3 43 5010 0 0 74 
mo 0 0 0 0 0 0 0 0 0 4 69 0 0 0 0 0 0 3%4 
Mo 000 0 0 0 0 00 0 02 7 10 1 0 0 0246 
000 0 000 0 80 8 0 00 0 00 6 140 20 0 4 22 
MTO 000 0 0 0 0 00 0 00 0 0 16 1 2 0 152 
mo 0 0 0 0 0 0 0 00 0 00 2 0 036 2 0 1.4 
Wo 000 0 0 0 0 00 0 00 0 0 0 2 14 2 56 
(co 000 00 0 0 00 0 00 0 0 01 1 17 09 


At the bottom of the list, CCC ranked seventeenth in one sub- 
tournament; eighteenth, in one sub-tournament; nineteenth and last, 
in seventeen sub-tournaments. Hence CCC bettered two opponents on one 
occasion, and one opponent on another occasion. Its relative ef- 
ficiency is therefore 3/342, or 0.009. Thus CCC is only 0.9% effi- 
cient in sub-tournaments involving nineteen strategies. 

In table 8.5, notice that the non-zero entries tend to be 
clustered along the main diagonal of the matrix. This general lack of 
dispersion throughout each row indicates that a given strategy tends 
to achieve the same rank, or else to perform within a narrow range of 
ranks, in each of its appearances. One extreme case is MAC, which 
ranked first in the nineteen sub-tournaments in which it appeared. At 
the other extreme is MEU, whose rankings are distributed across eight 
consecutive columns. In its nineteen appearances, MEU attained a
range of ranks between third and tenth places inclusive.

The average rank-dispersion in table 8.5 (that is, the average 
number of different ranks attained by a given strategy), is 4.6 of a
possible 19 ranks per strategy. Overall, the actual rank attainments 
are dispersed over less than 25% of the field of possible rank 
attainments. This denotes an expected result; namely, that in the 
twenty sub-tournaments involving different combinations of nineteen 
strategies, the absence of any particular strategy from a given sub- 
tournament does not drastically influence the relative success of the 
remaining competitors. In other words, slight variations in the 
constitution of a large population do not exert a pronounced effect 
on the bulk of its members' performances. 
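
The rank-dispersion measure can be illustrated with a brief sketch. Only the three rows of table 8.5 that are fully described in the text (MAC, MAE and CCC) are used, so the average obtained here is that of the three-row sample and not the figure of 4.6 reported for the full table; the function name is an illustrative assumption.

```python
def rank_dispersion(rows: dict) -> float:
    """Average number of distinct ranks attained per strategy.

    Each row lists how many times a strategy finished 1st, 2nd, ...;
    a rank counts as attained if its tally is non-zero."""
    distinct = [sum(1 for tally in row if tally > 0) for row in rows.values()]
    return sum(distinct) / len(distinct)


# Three rows of table 8.5 (r = 19), reconstructed from the text.
sample_rows = {
    "MAC": [19] + [0] * 18,
    "MAE": [0, 12, 3, 2, 0, 2] + [0] * 13,
    "CCC": [0] * 16 + [1, 1, 17],
}

average = rank_dispersion(sample_rows)   # (1 + 4 + 3) / 3 for this sample
share_of_field = average / 19            # fraction of the 19 possible ranks
print(round(average, 2), round(100 * share_of_field, 1))
```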

By the same token, one expects an increased dispersion of 
rankings as the number of strategies per sub-tournament diminishes 
(and the corresponding number of possible combinations increases). 
Consider the distribution of rankings at the next combinatoric stage, 
in table 8.6 (overleaf). 

Although MAC still dominates the standings, it too begins to 
show a dispersion of rank. The average rank-dispersion in table 8.6 
is now 6.75 of a possible 18 ranks per strategy, while actual rank 
attainments are dispersed over 37.5% of the field of possible rank 
attainments. The absence of one additional strategy per sub-tourna-
ment, and the increased number of combinations resulting therefrom,
give rise to a corresponding increase in variations of strategic
performance.
Table 8.6 — 171 Appearances in 190 Sub-Tournaments
Involving 18 Strategies


Rank: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ER
MAC 162 7 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 99.6
SHU 16 56 51 28 14 4 1 1 0 0 0 0 0 00 0 0 0 886 
ME 8 9% 2 18 9 10 7 4 1 00 0 0 00 0 0 0 87.8 
CHA 418 2 46 42 3 6 1 0 0 0 0 0 00 0 0 0 83 
FRI 0 13 58 36 26 17 5 7 9 00 0 0 00 0 0 0 82 
EM 0 5 15 29 54 3% 26 6 0 00 0 0 00 0 0 0 7.4 
MEU 0 0 19 23 18 2 3 17 A 27 2 0 0 00 0 0 0 666 
TT 0 0 0 5 15 4 66 32 7 00 0 0 00 0 0 0 663 
TE 00 1 5 12 20 4 60 3l 0 0 0 0 00 0 0 0 626 
TT 0 0 0 0 0 4 14 58 77 18 0 0 0 00 0 0 0 57 
BE 00 0 0 0 0 0 3 7 MAIR 20 1 0 0 39.8 
68 0 0 0 0 0 D 0 0 4 48 63 2 27 43 0 0 0 37 
MD 00 0 0 0 0 0 80 2 33 24 30 34 103 3 6 0 31 
Mm 0 0 0 0 0 0 0 0 5 1749 5 36 90 0 0 0 3.8 
Ry 0 0 0 0 0 0 0 0 0 0 0 10 30 10229 0 0 0 243 
DDD 0 0 9 0 0 0 0 00 3 9 2 3 NHH 24 9 10 24 
far 0 0 0 0 0 0 0 0 0 00 0 2 AA 23 U 7 88 
AYD 0 0 0 0 0 0 0 Q 0 0 2 1 6 525 96 34 2 128 
W o 0 0 0 0 0 0 0 0 00 0 0 02 34 6 73 47 
cc 0 0 0 0 0 0 0 0 0 00 0 0 06 9 58 98 32 


A final example of this tendency, for the 184,756 combinations 
of ten strategies, is given in table 8.7 (overleaf). 

At this combinatoric level, each strategy appears in 92,378 
sub-tournaments, and is absent from a like number. The average rank- 
dispersion in table 8.7 is now 9.6 of a possible 10 ranks per stra- 
tegy, while actual rank attainments are dispersed over 96% of the 
field of possible rank attainments. The large number of combinations 
of ten strategies allows great variation in relative performance. 
With the exception of NYD, every strategy is able to win at least one 
sub-tournament; most, many more. 

Once again, MAC proves most successful, winning more than half 
the sub-tournaments in which it appears. Its rankings, however, are 
now dispersed over nine of ten places; but MAC ranks ninth (its worst 
performance) in only seven of 92,378 sub-tournaments. In fact, four 
of the upper six strategies (MAC, SHU, CHA, and ETH) never finish 
last in any of the sub-tournaments in which they appear. By contrast,
CCC maintains a secure hold on last place: it is the only strategy to
finish tenth in more than 50% of its appearances.


Table 8.7 — 92,378 Appearances in 184,756 Sub-Tournaments
Involving 10 Strategies


Rank: 1 2 3 4 5 6 7 8 9 10 ER 
MAC 48981 18468 9999 5672 511 3022 949 169 7 0 88.2 
ME 35411 21570 10181 7020 6460 6017 3858 143 39 2 813 
SHU 24361 25207 15892 12170 9008 4307 1248 16 20 0 80.8 
FRI 25997 18180 14466 12839 9383 6998 3150 1101 216 48 7.5 
CHA 17082 16832 21127 19435 13000 4221 67 43 l 0 76.6 
EM 11538 18032 21656 21768 13837 4691 767 89 0 0 74.7 
TES 6724 15254 16875 18740 18540 12157 3396 607 RB 7 81 
IFT 3231 10740 22751 25645 19456 0048 20601 411 3 i 679 
MEU 7461 22688 16227 10020 9994 11019 8339 4572 1758 300 66.4 
HT 1775 6151 11125 16241 22881 21515 9204 2881 57 18 3571 
MD 932 5385 8292 7327 8869 13181 13269 11694 13396 10033 39.3 
GO 222 1071 2353 5227 11132 20304 26684 19762 5587 3 385 
MD 370 1985 4270 6302 9458 16892 22413 17352 10310 3026 37.8 
RE 17 435 1447 4333 11066 18122 19621 16137 11785 9415 32.5 
MO 5 1311 5040 6514 6817 9904 12785 12779 12005 AN 28,3 
RAH 1 406 904 1988 3735 9560 24650 29770 16969 435 27.7 
IAT 455 979 1884 2581 3407 6543 10551 20650 24282 21046 21.9 
MD 0 13 110 532 1637 4132 9526 18168 33868 24372 15.3 
mM 3 8 10 371 680 1827 5777 16309 32065 35060 11.7 
ær t 0 0 27 260 2216 5863 10585 21597 51829 8.5 


The results of all combinations of all sub-tournament groups 
(from two to twenty competitors) are tabled in Appendix Three. It can 
be seen that MAC dominates all group sizes from twenty down to seven 
competitors, inclusive. MAE dominates groups of six and five com- 
petitors, while FRI prevails in groups of four and three. In the 190 
sub-tournaments involving two strategies, wherein each strategy makes 
19 appearances, FRI, SHU and TFT are most efficient. 

These results can be summarized as follows. A total of 1,048,555
different sub-tournaments have been conducted, by taking all combina-
tions of the population of competing strategies, in all group sizes
from twenty to two competitors. In all, each strategy appears in
524,287 sub-tournaments (the sum of its appearances in each group
size), and the efficiency of each strategy's performance is tabled
for each group. A relative measure of robustness can now be made by
calculating each strategy's overall efficiency across the entire 
range of group sizes. 

A strategy's overall efficiency is simply the weighted average
of its relative efficiencies in all groups. Suppose a given strategy
appears in N_i sub-tournaments for all combinations C(20,i) of i
competitors, and attains a relative efficiency of E_i in that group.
Then the given strategy's overall efficiency, E, is found by


\[ E = \frac{\sum_{i=2}^{20} E_i N_i}{\sum_{i=2}^{20} N_i} \]


(where the denominator = 524,287) 


The results of this calculation, for all strategies, appear in table 
8.8. 


Table 8.8 — Overall Efficiencies: 524,287 Appearances
in 1,048,555 Sub-Tournaments


Size: 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 E


52.6 61.7 68.7 74.0 78.0 81.3 83.9 86.2 88.2 90.1 91.8 93.6 95.2 96.6 97.9 98.9 99.6 100 100 88.8 
78.9 78.9 79.5 79.3 79.5 79.8 80.3 80.8 81.3 81.8 82.4 83.0 83.6 84.3 85.1 86.1 87.8 90.1 90.0 81.6 
100 84.5 78.2 76.3 76.3 77.2 78.4 79.6 80.8 81.8 82.8 83.6 84.5 85.4 86.4 87.4 88.6 89.8 85.0 81.2 
100 87.4 81.3 78.5 77.5 77.2 77.3 77.4 77.5 77.7 77.8 78.0 78.3 78.6 78.9 79.4 80.2 81.6 80.0 77.7 
73.7 67.3 67.3 69.1 71.2 73.0 74.4 75.6 76.6 77.4 78.1 78.8 79.4 80.0 80.4 80.5 80.3 79.5 75.0 76.7 
73.7 71.6 69.6 70.0 71.0 72.1 73.1 73.9 74.7 75.3 75.7 76.1 76.3 76.4 76.3 76.0 75.4 74.3 70.0 74.7 
100 80.4 74.3 70.7 68.9 68.2 67.9 67.9 67.9 68.0 67.9 67.8 67.5 67.2 66.8 66.5 66.3 65.8 60.0 67.9 
68.4 74.6 73.3 71.2 70.0 69.4 68.9 68.5 68.1 67.7 67.3 66.8 66.2 65.7 64.9 64.0 62.6 60.8 55.0 67.8 
63.2 68.7 68.1 67.5 57.1 66.7 66.5 66.4 66.4 66.5 66.5 66.6 66.7 66.7 66.7 66.6 66.6 66.7 65.0 66.5 
78.9 64.9 59.4 57.7 57.0 57.0 57.1 57.1 57.1 57.1 57.0 56.9 56.8 56.6 56.5 56.2 55.7 54.7 50.0 57.0 
42.1 42.4 41.1 41.8 40.5 40.4 40.1 39.3 39.3 38.7 38.2 38.2 37.4 37.3 37.1 36.3 37.7 36.5 30.0 38.9 
63.2 49.1 42.8 39.9 40.8 40.2 39.2 39.0 38.5 38.2 38.0 37.5 37.2 37.4 37.4 38.5 39.7 40.1 40.0 38.4 
36.8 41.8 40.6 39.9 39.4 39.0 38.5 38.0 37.8 37.4 37.2 37.0 36.8 36.8 36.9 36.8 36.8 37.4 35.0 37.7 
5.3 13.5 20.4 24.9 27.9 29.6 30.9 31.9 32.5 33.3 34.2 35.1 36.1 37.1 28.0 38.7 39.8 43.0 45.0 32.9 
42.1 36.8 36.8 35.1 32.8 31.8 30.5 29.3 28.3 27.5 26.5 25.7 25.3 24.0 23.5 23.7 22.4 22.2 25.0 28.1 
42.1 37.1 35.2 33.6 32.1 30.6 29.5 28.5 27.7 27.0 26.2 25.7 25.4 24.9 25.0 25.3 24.3 24.6 20.0 27.6 
42.1 31.6 30.6 29.0 27.2 25.6 24.3 23.0 21.8 20.8 19.8 18.8 18.1 17.5 16.7 15.7 15.8 15.2 15.0 21.5 
52.6 30.4 21.6 18.6 17.2 16.5 16.2 15.7 15.3 15.0 14.6 14.4 14.1 13.8 13.4 12.9 12.8 11.4 10.0 15.2 
31.6 28.4 21.8 19.3 16.7 14.4 13.5 12.8 11.7 11.1 10.6 9.6 8.7 8.0 6.8 6.1 4.7 5.6 5.0 11.6 
47.4 25.7 18.0 13.4 11.9 10.9 9.8 9.2 8.5 7.9 7.4 6.9 6.5 5.8 5.3 4.1 3.2 0.9 0.0 8.4 
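
The weighted average can be reproduced from a single row of table 8.8. The sketch below takes the first row of the table, which is assumed here to be MAC's (it is the only row showing 100% efficiency at group sizes nineteen and twenty), and weights each efficiency by the number of appearances N_i = C(19,i-1). The names are illustrative, and the result is subject to the rounding of the tabled percentages.

```python
from math import comb


def overall_efficiency(effs_by_size: dict, n: int = 20) -> float:
    """Weighted average of relative efficiencies E_i, each weighted by
    N_i, the number of sub-tournaments of size i in which a strategy
    appears."""
    weights = {i: comb(n - 1, i - 1) for i in effs_by_size}
    weighted_sum = sum(effs_by_size[i] * weights[i] for i in effs_by_size)
    return weighted_sum / sum(weights.values())


# First row of table 8.8 (assumed to be MAC), group sizes 2 to 20.
first_row = [52.6, 61.7, 68.7, 74.0, 78.0, 81.3, 83.9, 86.2, 88.2, 90.1,
             91.8, 93.6, 95.2, 96.6, 97.9, 98.9, 99.6, 100.0, 100.0]
mac_effs = dict(zip(range(2, 21), first_row))

# Denominator of the formula: total appearances of one strategy.
total_appearances = sum(comb(19, i - 1) for i in range(2, 21))

print(total_appearances, round(overall_efficiency(mac_effs), 1))
```

Because the mid-sized groups contribute by far the most appearances, a strategy's overall efficiency is dominated by its performance in groups of roughly eight to thirteen competitors.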

The overall efficiencies in table 8.8 can be fairly said to
represent the relative robustness of the strategies. MAC is clearly
the most robust strategy in the population of the interactive tourna- 
ment. MAC's "sibling" strategy, MAE, is the next most robust, fol- 
lowed closely by SHU, the least forgiving member of the tit-for-tat 
family. 

Comparing the standings in tables 8.8 (overall efficiencies)
and 8.1 (main tournament results), it seems significant that the
upper six and lower six strategies maintain identical ranks in both
cases. Given that table 8.1 is the result of the unique sub-tourna-
ment involving the single combination of twenty strategies, and that
table 8.8 is the weighted result of 1,048,555 different sub-tournaments
involving all combinations of all groups, then the upper and lower
third of the compiled standings of more than one million
sub-tournaments are "determined", as it were, by the unique outcome
featuring the largest group. It is a matter of speculation whether 
such determination would obtain anew, and to what degree, in dif- 
ferent initial strategic populations. 

One concludes the combinatoric analysis of sub-tournaments with 
a graph that illustrates how the efficiencies of the upper six 
strategies change as a function of group size: 


Graph 8.1 MOST EFFICIENT COMBINATORIC PERFORMANCES
All Sub-Tournaments, 2-19 Competitors

[Graph: efficiency (%) plotted against number of competitors (2 to 19)
for MAC, MAE, SHU, FRI, CHA and ETH.]

MAC and MAE are the sole top strategies whose efficiencies
increase uniformly with the size of the competing group. SHU and FRI,
which rank third and fourth respectively, do so because their ef-
ficiencies increase after falling off sharply in smaller groups. CHA
and ETH, which rank fifth and sixth respectively, experience a less
sharp early decrease in smaller groups, a gradual increase in mid-
sized groups, and a gradual falling-off in larger groups.

MAC, whose efficiency is lowest among the six top strategies at 
group sizes of two and three, experiences a much sharper rate of 
increase than MAE. Moreover, MAC continues to increase more sharply 
than MAE, SHU and FRI, even after assuming the lead at the group size 
of seven. The larger the competing population, the better MAC
performs, relative both to its own increasing efficiency, and to the
efficiencies of its competitors.

That MAC and MAE are the most robust strategies in the popula- 
tion of the interactive tournament, is a matter that requires further 
investigation in the context of this enquiry. MAC and MAE are the two 
most closely-related, and most co-operatively weighted (in order of 
rank), members of the maximization family. The final part of this 
enquiry attempts to account for their success, both in terms of their 
relatedness and co-operativeness. 

Before that attempt is made, however, the strategic population
is subjected to a different measure of robustness; namely, an
ecological scenario.


8 Recall that, during its first one hundred random moves, MAC
co-operates with a probability of 9/10; MAE, 5/7; MEU, 1/2; MAD,
1/10.
Chapter Nine
An Ecological Scenario 


The ecological scenario emerges from a consideration of evolu- 
tionary game theory, which itself developed from an application of 
game-theoretic concepts to certain types of conflicts in the sphere 
of biological evolution.¹ A distinction must be drawn, however,
between Maynard Smith's evolutionary game-theoretic model and Axel- 
rod's ecological scenario. It can be shown that Axelrod's tourna- 
ments, and the interactive tournament, are not susceptible to evolu- 
tionary modelling in the Maynard Smith sense. The ecological scena- 
rio, however, provides an interesting alternative perspective on 
strategic robustness. 

The classic Maynard Smith evolutionary game models con-specific 
conflicts in the animal kingdom exclusive of humans. Essentially,
Maynard Smith hypothesizes that if two members of a species compete
for a fitness-enhancing resource of expected utility V, each member
may adopt either the "hawk" strategy (H), which consists in monopoli-
zing the resource, or the "dove" strategy (D), which consists in
sharing it. If both competitors adopt the "hawk" strategy, then a
mutually-injurious conflict ensues, which reduces their fitnesses by
a palpable quantity C. The game matrix (9.1) follows.³

[Note that in game 9.1, D denotes the "dove" strategy and C
denotes the injurious effect of an (H,H) conflict. This is the
familiar notation for the evolutionary model, and these symbols
should not be confused with their signification in the Prisoner's
Dilemma.]


1 Lewontin seems to have been the first to conceive of a literal
game against nature, in applying the minimax criterion to population
genetics. See R. Lewontin, 'Evolution and the Theory of Games',
Journal of Theoretical Biology, 1, 1961, pp.382-403.


2 E.g. see J. Maynard Smith, 'The Theory of Games and the
Evolution of Animal Conflicts', Journal of Theoretical Biology, 47,
1974, pp.209-21.


3 Idem., Evolution and the Theory of Games, Cambridge at the 
University Press, 1982, p.12. 
Game 9.1 — The Maynard Smith Evolutionary Model


                      B
                H              D
A     H    (1/2)(V-C)        V, 0
      D       0, V          (1/2)V


The payoffs of game 9.1 can take on three different transitive
orderings, depending upon the relative values of V and C. Explicitly,
the three cases are:

case (1): V > C

case (2): V = C

case (3): V < C
Let each of these cases be considered in turn. 

In case (1), V > C. In other words, the fitness enhancement
resulting from possession of the resource is greater than the fitness
reduction resulting from the conflict over its acquisition. To both
competitors, then, the expected utility of the (H,H) outcome is
greater than zero. To either competitor, the "hawk" strategy is
strongly dominant, since (1/2)(V-C) > 0 and V > (1/2)V. Either
competitor's fitness is enhanced by his adoption of the "hawk"
strategy, no matter what his opponent does. If both competitors adopt
the "hawk" strategy, their fitnesses are enhanced by (1/2)(V-C);
whereas, if both adopt the "dove" strategy, their fitnesses are
enhanced by (1/2)V. Although monopolization is strongly dominant,
both competitors (if they play alike) gain more by sharing. Thus,
this case of game 9.1 is a Prisoner's Dilemma (with transitive
ordering of payoffs T > R > P > S).

In case (2), V = C. To both competitors, the expected utility
of possessing the resource is just balanced by the expected dis-
utility of acquiring it. Then, to competitor A, the "hawk" strategy
is weakly dominant, since his payoff of outcome (H,H) equals that of
outcome (D,H) [equals zero], and his payoff of outcome (H,D) is
greater than that of outcome (D,D) [since V > (1/2)V]. Similarly,
from competitor B's point of view, the "hawk" strategy weakly domina-
tes the "dove" strategy. Hence monopolization is weakly dominant.
However, the competitors gain nothing if both adopt the "hawk"
strategy, while each gains (1/2)V if both adopt the "dove" strategy.
Thus, this case of game 9.1 is a weak Prisoner's Dilemma (with
transitive ordering of payoffs T > R > P = S).
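
The three cases can be distinguished mechanically, as in the following sketch. It assumes positive V and C, and maps the row player's payoffs in game 9.1 onto the Prisoner's Dilemma labels (T for hawk against dove, R for dove against dove, P for hawk against hawk, S for dove against hawk). The function names and the numerical values of V and C are illustrative only.

```python
from fractions import Fraction


def hawk_dove_payoffs(V, C) -> dict:
    """Row player's payoffs in game 9.1 under Prisoner's Dilemma labels."""
    V, C = Fraction(V), Fraction(C)
    return {
        "T": V,             # hawk against dove: monopolize the resource
        "R": V / 2,         # dove against dove: share the resource
        "P": (V - C) / 2,   # hawk against hawk: injurious conflict
        "S": Fraction(0),   # dove against hawk: gain nothing
    }


def classify(V, C) -> str:
    """Name the payoff ordering that results from given V and C."""
    p = hawk_dove_payoffs(V, C)
    T, R, P, S = p["T"], p["R"], p["P"], p["S"]
    if T > R > P > S:
        return "Prisoner's Dilemma"
    if T > R > P == S:
        return "weak Prisoner's Dilemma"
    return "not a Prisoner's Dilemma"


for V, C in ((4, 2), (4, 4), (2, 4)):   # cases (1), (2) and (3)
    print(V, C, classify(V, C))
```

In the third case P falls below S, so mutual "hawk" play becomes the worst outcome for each competitor and the Prisoner's Dilemma ordering no longer holds.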

In cases (1) and (2), where V ≥ C, the pure "hawk" strategy
appears to prevail in nature.⁴ It is not difficult to understand why
it prevails in these cases. Maynard Smith's evolutionary game theory 
models con-specific conflicts in the neo-Darwinian paradigm. In neo- 
Darwinian terms, the "hawk" and “dove” strategies are phenotypic 
behaviour patterns mediated by genotypic attributes. Natural selec- 
tion acts upon the individual at the phenotypic level, thereby 
indirectly favouring, or disfavouring, the genotypes that mediate a 
given behavioural pattern. A significant component of an animal's
inclusive fitness is its ability to reproduce. Thus, an increase in 
fitness implies an increase in potential reproductivity. 

Now, for V ≥ C, suppose genome A mediates the phenotypic
behaviour of pure "hawk"; genome B, that of pure "dove". Encounters
between animals carrying these genomes are represented in the follow- 
ing matrix: 


Game 9.2 — Encounters Between Pure Strategies 


               A              B
A         (1/2)(V-C)        V, 0
B            0, V          (1/2)V


While the outcomes of games 9.1 and 9.2 are identical, the 
players are not. Game 9.1 models a conflict between two members of a 
species; game 9.2, all conflicts within the population at large. An 
animal carrying genome A enhances its fitness regardless of which 


4 Ibid., p.15.
type of con-specific it encounters; whereas an animal carrying genome
B enhances its fitness only when it encounters a con-specific carry- 
ing genome B. Animals playing "hawk" will thus enhance their fit- 
nesses with greater relative frequency and, ceteris paribus, will 
produce more offspring, than animals playing "dove". In consequence, 
genome A is positively selected; B, negatively selected. In theory,
after a sufficient number of generations, the population will consist 
predominantly of pure "hawks"; pure "doves" will have become mar- 
ginalized. 

This explanation for the natural prevalence of the pure "hawk" 
strategy, albeit advanced in a game-theoretic model that is arguably 
over-simplified, is nonetheless interesting in neo-Darwinian terms.
The explanation becomes more compelling when one considers the case
to which it does not apply; namely, case (3), in which V < C. When
the expected enhancement of fitness resulting from the possession of 
a resource is less than the expected reduction in fitness resulting 
from the attempt to monopolize it, the competitor is confronted by a 
novel situation. In this case, Maynard Smith's evolutionary model 
embodies a problem hitherto unseen in previous Prisoner's Dilemmas, 
but copiously apparent in nature. 

Many animal species are equipped with physical or chemical 
weapons, lethal not only to their predators or prey, but also to con- 
specifics. Since natural selection has favoured the evolution of such 
weapons, it must also have favoured behavioural patterns that prevent 
armed con-specifics from annihilating one another. Indeed, the 
phenomenon of limited or ritualized con-specific combat abounds in 
the arenas of nature. From the mantis shrimp which batter one another 
on their heavily-armoured tails, to the venomous serpents which 
wrestle one another instead of unsheathing their deadly fangs, to the 
wolves which expose their jugulars in combat-terminating gestures of 
appeasement, one observes a myriad of ways in which conspecific 
competition over fitness-enhancing resources is conducted in a 
strenuous yet neither fatal nor debilitating fashion. 

The competitors in the game-theoretic model cannot reflect this 
behaviour by adopting pure strategies, be they "hawk" or "dove". It 
is here that Maynard Smith makes an ingenious contribution to
evolutionary games, by introducing the concept of an evolutionarily
stable strategy (ESS). Suppose that there exists a mixed strategy I,
which
consists in playing H with probability p, and in playing D with 
probability (1-p). Again, within the paradigm of neo-Darwinism, it is 
presumed that the phenotypic behavioural pattern giving rise to 
strategy I is mediated by a genome that determines the probability 
distribution. If I is optimally effective in terms of fitness-enhan- 
cement, then this genome will be positively selected. Such an optimal 
mixed strategy, for a given species, is called an ESS. 
An ESS is defined as a strategy such that


"...if most of the members of a population adopt it,
there is no 'mutant' strategy that would give higher
reproductive fitness",⁵


or, alternatively, as a strategy such that


"...if all the members of a population adopt it, then
no mutant strategy could invade the population under the
influence of natural selection."⁶


Explicitly, to find probability p such that I is an ESS,
Maynard Smith makes use of the Bishop-Cannings theorem⁷ and writes
the following equation:


EU(H,I) = EU(D,I)


In other words, the expected utility of playing "hawk" against an ESS
is the same as that of playing "dove" against it. If one finds the
probability distribution that satisfies this equation, one has the
probability distribution of the ESS itself. Maynard Smith then solves
this equation for game 9.1:⁸


5 J. Maynard Smith & G. Price, 'The Logic of Animal Conflict',
Nature, 246, 1973, pp.15-18.


6 Maynard Smith, 1982, p.10.

7 T. Bishop & C. Cannings, 'A Generalized War of Attrition',
Journal of Theoretical Biology, 70, 1978, pp.85-124. They prove that
if I is a mixed ESS with component strategies a,b,...,z then EU(a,I)
= EU(b,I) = ... = EU(z,I) = EU(I,I).


8 Maynard Smith, 1982, pp.15-16.

$$p\left[\tfrac{1}{2}(V-C)\right] + (1-p)V = p(0) + (1-p)\tfrac{1}{2}V$$

or

$$p = V/C$$


Thus, in the case when V < C, it is evolutionarily stable to
adopt the "hawk" strategy with probability V/C, and the "dove"
strategy with probability (1 - V/C).
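The indifference condition can be checked numerically. The following
Python sketch is illustrative only: it assumes the usual hawk-dove
payoffs for game 9.1 (hawk against hawk, (V-C)/2; hawk against dove,
V; dove against hawk, 0; dove against dove, V/2), and the function
names and the sample values V = 2, C = 4 are hypothetical.

```python
from fractions import Fraction


def expected_utility(own_move, p_hawk, V, C):
    """Expected utility of a pure move ('H' or 'D') against a mixed
    strategy that plays hawk with probability p_hawk."""
    if own_move == "H":
        return p_hawk * Fraction(V - C, 2) + (1 - p_hawk) * V
    return p_hawk * 0 + (1 - p_hawk) * Fraction(V, 2)


def ess_hawk_probability(V, C):
    """Solution p = V/C of EU(H, I) = EU(D, I); meaningful only if V < C."""
    return Fraction(V, C)


# Hypothetical sample values: resource worth V = 2, cost of injury C = 4.
p = ess_hawk_probability(2, 4)
eu_hawk = expected_utility("H", p, 2, 4)
eu_dove = expected_utility("D", p, 2, 4)
print(p, eu_hawk, eu_dove)
```

With these sample values p is 1/2, and "hawk" and "dove" each earn an
expected utility of 1/2 against the mixed strategy, as the
Bishop-Cannings condition requires.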

Now, Axelrod and Hamilton attempt to apply the concept of
evolutionarily stable strategy to the iterated Prisoner's Dilemma.⁹
It would be useful indeed if evolutionary game theory could point to
an optimally effective mixed strategy in the Prisoner's Dilemma.
Unfortunately, the theory cannot do so, for the simple reason that no
ESS exists in the Prisoner's Dilemma. A rigorous proof that no ESS
can be found for the Prisoner's Dilemma is given elsewhere.¹⁰ For the
purposes of this enquiry, a brief demonstration can be made that the
concept of ESS is inapplicable to the Prisoner's Dilemma.

The demonstration takes the form of a reductio ad absurdum. Let 
one assume that an ESS exists in the Prisoner's Dilemma, and let the 
Maynard Smith equation be applied to find its explicit probability 
distribution. 


Game 9.3 — The Prisoner's Dilemma 


where T > R > P > S

With respect to game 9.3, the Maynard Smith equation is written


9 R. Axelrod & W. Hamilton, 'The Evolution of Cooperation',
Science, 211, 1981, pp.1390-6.


10 L. Marinoff, 'The Inapplicability of Evolutionarily Stable
Strategy to the Prisoner's Dilemma', The British Journal for the
Philosophy of Science, 41, 1990, pp.458-470 (pending).

$$EU(C, I) = EU(D, I)$$


In other words, the expected utility of co-operating against the ESS
is equal to the expected utility of defecting against it. Explicitly,


$$p(R) + (1-p)S = p(T) + (1-p)P$$

or

$$p = \frac{P-S}{(R+P)-(S+T)} \tag{9.1}$$


Since p is a (real) probability, its permissible values are 0 ≤
p ≤ 1. Hence, these are also the permissible values for the right-
hand side of equation (9.1).

Now, consider the quotient (P-S)/[(R+P)-(S+T)]. The numerator,
(P-S), is always greater than zero (since by definition, P > S).
Thus, to satisfy the constraint on the permissible values of p, the
denominator (R+P)-(S+T) must be greater than or equal to the
numerator. That is,


$$(R+P)-(S+T) \ge (P-S)$$

or

$$R \ge T \tag{9.2}$$


But inequality (9.2) cannot be satisfied, since, by definition,
T > R. Thus, (P-S) > (R+P)-(S+T).

In consequence, the quotient (P-S)/[(R+P)-(S+T)] is either
greater than unity [if 0 < (R+P)-(S+T) < (P-S)], or less than zero
[if (R+P)-(S+T) < 0]. But these are precisely the values of p that
fail to satisfy the constraint on equation (9.1). Since equation
(9.1) has no solution such that p is a real probability (0 ≤ p ≤ 1),
therefore no ESS exists in the Prisoner's Dilemma.
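The same conclusion can be checked by brute force. The sketch below
is a numerical illustration only, not a substitute for the proof: it
evaluates the right-hand side of equation (9.1) over a hypothetical
grid of integer payoffs satisfying T > R > P > S, and collects any
case in which the result lies between 0 and 1. The function name and
the payoff ranges are assumptions made for the example.

```python
from fractions import Fraction


def candidate_p(T, R, P, S):
    """Right-hand side of equation (9.1); None if the denominator is zero."""
    denominator = (R + P) - (S + T)
    if denominator == 0:
        return None
    return Fraction(P - S, denominator)


# Scan small integer payoffs that satisfy T > R > P > S.
violations = []
for T in range(4, 9):
    for R in range(3, T):
        for P in range(2, R):
            for S in range(1, P):
                p = candidate_p(T, R, P, S)
                if p is not None and 0 <= p <= 1:
                    violations.append((T, R, P, S))

example = candidate_p(5, 3, 1, 0)  # a commonly used set of payoffs
print(example, violations)
```

For the payoffs T = 5, R = 3, P = 1, S = 0 the quotient is -1, and
the scan finds no payoff set for which p is a probability.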

Structurally, it is not difficult to see why this is so. The
transitive ordering of payoffs in the Prisoner's Dilemma is T > R > P
> S (or, in the case of weak dominance, T > R > P = S). These
orderings correspond to cases (1) and (2) (where V > C and V = C,
respectively), of Maynard Smith's evolutionary model. But it is case
(3) of the evolutionary model (where V < C) that prompts Maynard
Smith's search for an ESS. But in case (3), the transitive ordering of
payoffs is T > R > S > P. This is not a Prisoner's Dilemma. So the
existence of an ESS in case (3) cannot and does not imply the
existence of an ESS in the Prisoner's Dilemma. The two games belong to
utterly different classes, according to the respective orderings of
their payoffs.

However, the inapplicability of the concept of ESS to the
Prisoner's Dilemma does not preclude ecological modelling. Axelrod
develops a very interesting scenario in his second tournament, based
upon an ecological perspective.¹¹ The principal assumptions in the
model are as follows.

Suppose that the total payoffs accrued (that is, points scored)
by some strategy A, in competition against other strategies,
represent the initial population of A-strategists in the first
generation of the tournament. The relative population of
A-strategists in that generation is therefore the ratio of strategy
A's total points scored to the sum of total points scored by all
strategies. Similarly, each competing strategy represents a unique
population of strategists, whose relative frequency is the ratio of
that strategy's total points scored to the sum of total points scored
by all strategies.

Next, one simulates future generations of the tournament. The
ratio of total points scored by strategy A to the sum of total points
scored by all strategies in the nth generation represents the
population of A-strategist offspring, descended from A-strategists in
the (n-1)th generation, presently competing in the nth generation.
Depending on how they fare against other strategists in the overall
population, these A-strategists will produce a relative number of
offspring who compete in the (n+1)th generation, and so forth.

Axelrod explains how his ecosystemic competition is conducted:


"We simply have to interpret the average payoff received by an
individual as proportional to that individual's expected number
of offspring. For example, if one rule gets twice as high a
tournament score in the initial round as another rule, then it
will be twice as well-represented in the next round. This
creates a simulated second generation of the tournament in
which the average score achieved by a rule is the weighted
average of its score with each of the rules, where the weights
are proportional to the success of the other rules in the
initial generation."¹²


11 Axelrod, 1980b, pp.398-401.


As Axelrod indicates, this process simulates "survival of the
fittest":


"A rule which is successful on average with the current
distribution of other rules in the population will become an
even larger proportion of the environment of the other
rules in the next generation. At first a rule which is
successful with all sorts of rules will proliferate, but
later as the unsuccessful rules disappear, success
requires good performance with other successful rules."¹³


When Axelrod conducts his ecological experiment with the sixty- 
three strategies of his second tournament, he finds that, after 500 
generations, only eleven strategies have increased their relative 
sizes in the population, and that these strategies ranked uppermost 
in the parent generation. After 1000 generations, only six strategies 
continue to increase their relative numbers of offspring (and they 
ranked first, third, second, sixth, seventh and ninth originally). Of 
these, Axelrod finds that TFT has produced the greatest number of
offspring, and that TFT continues to grow at the most rapid rate.¹⁴

Axelrod's ecological scenario is emulated in the environment of 
the interactive tournament. Axelrod does not explicitly state the 
algorithm he uses to simulate future generations, but the algorithm 
developed in this enquiry embodies the main precepts of Axelrod's 
model. When strategy A encounters strategy B in the initial
generation, the ratio of their scores is interpreted as the ratio of their
offspring produced in competition against one another. The likelihood 
with which these offspring encounter one another in the second 
generation is proportional to the relative numbers of A-strategists 
and B-strategists in the initial generation. This algorithm is 
applied to all strategies in the environment, and is iterated for a 
sufficient number of generations, until all rates of growth (and 
decline) subside to a quiescent state. 


12 Ibid, pp.398-9.


13 Ibid, p.399.


14 Ibid, pp.400-1.




A fairly straightforward mathematical notation is introduced,
in order to show explicitly how this algorithm functions. (The actual
documented computer program is listed in Appendix Four, program
A4.12.)

$(AvB)_n$ means "the relative number of nth generation offspring
produced by A-strategists in competition against B-strategists." Thus
$(AvB)_1$ is strategy A's tournament score against strategy B.

$(TOT)_{A,n}$ means "the total relative number of nth generation
offspring produced by A-strategists in competition against all
strategists";

i.e., $(TOT)_{A,n} = (AvA)_n + (AvB)_n + (AvC)_n + \dots + (AvZ)_n$
for Z different strategies in the environment. Thus $(TOT)_{A,1}$ is
strategy A's total score in the tournament.

$(SUM)_n$ means "the total relative number of all strategists in
the nth generation";

i.e., $(SUM)_n = (TOT)_{A,n} + (TOT)_{B,n} + (TOT)_{C,n} + \dots + (TOT)_{Z,n}$
for Z different strategies in the environment. Thus $(SUM)_1$ is the
sum of all offensive scores in the tournament.

$(FRE)_{A,n}$ means "the relative frequency of A-strategists in the
nth generation";

i.e., $(FRE)_{A,n} = (TOT)_{A,n} / (SUM)_n$

Thus, the relative frequency of A-strategists in the initial
generation, $(FRE)_{A,1}$, is the ratio of strategy A's total offensive
score [$(TOT)_{A,1}$] to the sum of all strategies' total offensive
scores [$(SUM)_1$]. Note that all such relative frequencies, in the
initial generation, are computed directly from the tournament matrix
of raw scores (see Appendix Two).
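These definitions translate directly into a few lines of code. The
sketch below is not the program of Appendix Four; it is a minimal
Python illustration in which the function names and the
three-strategy matrix of raw scores are hypothetical.

```python
def totals(matrix):
    """(TOT) for each strategy: the sum of its row of relative offspring."""
    return {a: sum(row.values()) for a, row in matrix.items()}


def relative_frequencies(matrix):
    """(FRE) for each strategy: its (TOT) divided by (SUM)."""
    tot = totals(matrix)
    grand_sum = sum(tot.values())  # (SUM)
    return {a: tot[a] / grand_sum for a in tot}


# Hypothetical raw scores; scores[a][b] is the score of a against b.
scores = {
    "A": {"A": 300, "B": 100, "C": 200},
    "B": {"A": 150, "B": 100, "C": 50},
    "C": {"A": 50, "B": 25, "C": 25},
}

tot = totals(scores)
fre = relative_frequencies(scores)
ppt = {a: round(1000 * f) for a, f in fre.items()}  # parts per thousand
print(tot, ppt)
```

In this invented example the totals are 600, 300 and 100, so the
initial frequencies are 600, 300 and 100 parts per thousand.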

The raw scores for the second generation of A-strategists are
then computed from the following recurrence relation:

$$(AvB)_2 = (AvB)_1 \, \frac{(TOT)_{A,1}}{(TOT)_{A,1} + (TOT)_{B,1}}$$

An exhaustive implementation of this recurrence relation yields
a second-generation matrix of raw scores, or relative numbers of
offspring (for Z different strategies):

$$\begin{matrix}
(AvA)_2 & (AvB)_2 & \dots & (AvZ)_2 \\
(BvA)_2 & (BvB)_2 & \dots & (BvZ)_2 \\
\vdots & \vdots & & \vdots \\
(ZvA)_2 & (ZvB)_2 & \dots & (ZvZ)_2
\end{matrix}$$


Once this matrix has been computed, the above-described procedure for
finding the relative frequencies is implemented for this second
generation.

In general, then, for each subsequent generation, the
recurrence relation

$$(AvB)_{n+1} = (AvB)_n \, \frac{(TOT)_{A,n}}{(TOT)_{A,n} + (TOT)_{B,n}}$$

is used to compute the new matrix of offspring, from which each
strategy's relative frequency in that generation can be found.
Note that if the nth generation ratio of offspring,
$(AvB)_n / (BvA)_n$, has the numerical value a/b, then the (n+1)th
generation ratio will be

$$\frac{(AvB)_{n+1}}{(BvA)_{n+1}} = \left(\frac{a}{b}\right) \frac{(TOT)_{A,n}}{(TOT)_{B,n}}$$

This satisfies the two principal requirements of Axelrod's model;
namely, that the ratio of offspring between two competing strategies,
in any future generation, be proportional to

(i) their ratio of offspring in the previous generation, and

(ii) their relative frequencies in the previous generation.
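The ratio property can be verified on a small example. The sketch
below assumes that the recurrence takes the form "new (AvB) equals
old (AvB) multiplied by TOT of A and divided by the sum of TOT of A
and TOT of B"; the function name and the three-strategy matrix are
hypothetical, and the code is not the program of Appendix Four.

```python
def next_generation(matrix):
    """Apply the assumed recurrence to every cell of the offspring matrix."""
    tot = {a: sum(row.values()) for a, row in matrix.items()}
    return {
        a: {b: matrix[a][b] * tot[a] / (tot[a] + tot[b]) for b in matrix[a]}
        for a in matrix
    }


# Hypothetical raw scores; scores[a][b] is the score of a against b.
scores = {
    "A": {"A": 300, "B": 100, "C": 200},
    "B": {"A": 150, "B": 100, "C": 50},
    "C": {"A": 50, "B": 25, "C": 25},
}

second = next_generation(scores)
tot_a = sum(scores["A"].values())
tot_b = sum(scores["B"].values())

old_ratio = scores["A"]["B"] / scores["B"]["A"]   # a/b in the text
new_ratio = second["A"]["B"] / second["B"]["A"]
predicted = old_ratio * tot_a / tot_b             # (a/b) * TOT_A / TOT_B
print(new_ratio, predicted)
```

The two printed values agree, since the common denominator cancels
when the ratio of the two new cells is taken.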

The relative frequency of each strategy's progeny, in a given
generation, is expressed in parts per thousand (ppt) of the overall
population in that generation. The ecological scenario involving the
twenty strategies of the interactive tournament attains a stable
state after about 325 generations. That is to say, following the
325th generation, the rate of change has slowed to the extent that
all strategies' cumulative increases or decreases in relative
frequency are less than one part per thousand over the next several
generations. Although minor fluctuations continue to take place, in
increments (or decrements) of parts per ten thousand per generation
and less, these fluctuations are negligible on the scale of the
scenario.
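A stopping rule of this kind might be sketched as follows. The code
is an assumption-laden illustration rather than the program used in
this enquiry: it assumes the recurrence multiplies each cell (AvB) by
TOT of A divided by the sum of TOT of A and TOT of B, it reads
"several generations" as a window of five, and the function names and
the three-strategy matrix are hypothetical.

```python
def frequencies_ppt(matrix):
    """Relative frequency of each strategy, in parts per thousand."""
    tot = {a: sum(row.values()) for a, row in matrix.items()}
    grand_sum = sum(tot.values())
    return {a: 1000 * tot[a] / grand_sum for a in tot}


def next_generation(matrix):
    """One step of the assumed recurrence (guarding against 0/0)."""
    tot = {a: sum(row.values()) for a, row in matrix.items()}
    new = {}
    for a in matrix:
        new[a] = {}
        for b in matrix[a]:
            denom = tot[a] + tot[b]
            new[a][b] = matrix[a][b] * tot[a] / denom if denom > 0 else 0.0
    return new


def run_until_stable(matrix, window=5, tolerance_ppt=1.0,
                     max_generations=5000):
    """Iterate until no frequency changes by tolerance_ppt over `window`
    generations; return (generations elapsed, final frequencies)."""
    history = [frequencies_ppt(matrix)]
    while len(history) < max_generations:
        matrix = next_generation(matrix)
        history.append(frequencies_ppt(matrix))
        if len(history) > window:
            old, new = history[-1 - window], history[-1]
            if max(abs(new[s] - old[s]) for s in new) < tolerance_ppt:
                break
    return len(history), history[-1]


# Hypothetical raw scores; scores[a][b] is the score of a against b.
scores = {
    "A": {"A": 300, "B": 100, "C": 200},
    "B": {"A": 150, "B": 100, "C": 50},
    "C": {"A": 50, "B": 25, "C": 25},
}
generations, stable = run_until_stable(scores)
print(generations, {s: round(f) for s, f in stable.items()})
```

The window length and tolerance are tunable; a longer window or a
smaller tolerance postpones the generation at which stability is
declared.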

The results of the ecological scenario involving twenty
strategies are displayed in the following bar-chart, which shows the
initial (parent generation) and stable (325th generation) frequencies
for each strategy. The strategies appear, from left to right, in
descending order of their stable frequencies.

It is clear from graph 9.1 that MAC, which has the largest
initial frequency (60 ppt), experiences the greatest increase, to a
stable frequency of 142 ppt. This represents an increase of 82 ppt
over 325 generations, or an average growth rate of 0.25 ppt per
generation. And MAE, which has the second largest initial frequency
(57 ppt), experiences the second greatest increase, to a stable
frequency of 119 ppt. MAE's average rate of growth is thus 0.19 ppt
per generation.


Graph 9.1 ECOLOGY OF MAIN TOURNAMENT
Initial vs. Stable States

Obviously, the size-order of the initial frequencies is identical
to the rank-order of the tournament, since a strategy's initial
frequency is its total tournament score divided by the sum of all
strategies' total tournament scores, and this divisor remains
constant (for a given matrix of raw scores). However, the size-order
of the stable frequencies does not necessarily correspond to that of
the initial frequencies. For example, SHU ranks third in initial
frequency (56 ppt), but slips to a distant fourth in stable frequency
(67 ppt). SHU is overtaken by MEU, which ranks only seventh in
initial frequency (53 ppt), but third in stable frequency (98 ppt).
SHU's growth rate is 0.034 ppt per generation; MEU's, 0.14 ppt per
generation.
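The average growth rates quoted for the twenty-strategy ecosystem
follow from simple arithmetic, as the short check below shows. The
frequencies are those given in the text; the function name is a
hypothetical convenience.

```python
def average_growth_rate(initial_ppt, stable_ppt, generations):
    """Average change in frequency, in ppt per generation."""
    return (stable_ppt - initial_ppt) / generations


# (initial ppt, stable ppt) after 325 generations, as stated in the text.
observed = {"MAC": (60, 142), "MAE": (57, 119), "SHU": (56, 67), "MEU": (53, 98)}
rates = {
    name: average_growth_rate(initial, stable, 325)
    for name, (initial, stable) in observed.items()
}
for name, rate in rates.items():
    print(name, round(rate, 3))
```

Rounded, the rates are about 0.25 (MAC), 0.19 (MAE), 0.034 (SHU) and
0.14 (MEU) ppt per generation.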

That MAC, MAE and MEU produce the greatest relative numbers of
progeny, respectively, is a testament not only to their individual
fitnesses, but also to the overall fitness of the maximization family
in this ecosystem.

At the other end of the spectrum, it is probably no coincidence
that the greatest declines in frequency are experienced by CCC (-41
ppt), NYD (-41 ppt), and TQC (-38 ppt). Not only do these strategies
have the lowest initial frequencies, but also, perhaps significantly,
their order of ecological decline corresponds exactly to their order
of points allowed in the interactive tournament. Moreover, CCC has
apparently become "extinct", since, from the tenth generation onward,
its relative frequency is zero ppt.

The reproductive fortunes of the eight most fecund strategies,
in terms of their instantaneous rates of change, can be gauged from
the following graph (overleaf):

Graph 9.2 RATES OF POPULATION CHANGE
Most Fecund Strategies
(Axis label: Frequency per 1000 Population)

Given that one of the strategies (CCC) has become extinct in
this ecosystem, it seems reasonable to ask another question: what
would happen if the nineteen surviving strategies were to
re-establish themselves in a new ecological habitat, with
corresponding initial conditions, and subject to the same generative
algorithm, save that all CCC-strategists have disappeared from the
ecosystem?

The scenario is thus regenerated in a new ecosystem of nineteen
surviving strategies, with the following result (overleaf):

Graph 9.3 ECOLOGY OF NINETEEN SURVIVORS
Initial vs. Stable States

In this ecosystem, rates of growth and decline subside to
negligibility after about 450 generations. Again, MAC has the largest 
initial frequency (62 ppt), and experiences the greatest increase, to 
a stable frequency of 176 ppt. SHU, with the second largest initial 
frequency (60 ppt), ranks fourth at stability (98 ppt). MAE, which 
has the sixth largest initial frequency (58 ppt), vaults past FRI, 
CHA, ETH, and SHU, to rank second at stability (138 ppt). And MEU, 
initially in a three-way tie for eighth place (55 ppt), finishes 
third at stability (99 ppt). The maximization family continues to 
exhibit reproductive fitness in this ecosystem. 

This procreative model is clearly sensitive to perturbation (by 
the removal or, inversely, by the addition of a competing strategy). 
The term "ecology" seems well-chosen by Axelrod, in that the extinc- 
tion of one strategy has palpable repercussions on the interactions 
among the nineteen survivors. In the original ecosystem, both MAE and 
MAC enjoy comparatively high reproductive success in competition with 
CCC. As soon as CCC becomes extinct, MAE falls from second to fourth
place in initial frequency; MEU, from sole possession of seventh to a
three-way tie for eighth. That MAE and MEU now overtake numerous
competitors, in order to finish second and third behind MAC,
illustrates their fitness in regaining lost reproductive ground.

The perturbation also results in the extinction of two more
strategies: in this new ecosystem, NYD's progeny vanish after the
tenth generation; TQC's, after the eleventh. Once again, the first
strategy to become extinct in this ecosystem is the strategy with the
lowest initial frequency (NYD, 41 ppt). But TQC, which disappears one
generation later, shares the second-lowest initial frequency with TAT
(43 ppt). Although TAT experiences a sharp decline, it manages to
stabilize at 8 ppt.

The eliminatory process is continued by establishing another 
ecosystem, composed of the eighteen surviving strategies after the 
demise of NYD. This ecosystem is similarly procreated until stability 
is attained, whereupon another new ecosystem is formed, by deleting 
the next strategy to become extinct. This eliminatory process is 
repeated to its eventual conclusion. The results of all ecosystemic 
competitions are summarized in table 9.1 (overleaf). These results 
are revealing, and also somewhat intriguing. 
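The eliminatory process can be outlined in code. The sketch below
rests on several assumptions that are not taken from the documented
program: the recurrence multiplies each cell (AvB) by TOT of A
divided by the sum of TOT of A and TOT of B; a strategy counts as
extinct once its frequency falls below 0.5 ppt (that is, rounds to
zero ppt); and stability means a change of less than 1 ppt over five
generations. For brevity it stops each ecosystem at the first
extinction instead of running on to stability, which identifies the
same strategy for deletion. All names and scores are hypothetical.

```python
def frequencies_ppt(matrix):
    tot = {a: sum(row.values()) for a, row in matrix.items()}
    grand_sum = sum(tot.values())
    return {a: 1000 * tot[a] / grand_sum for a in tot}


def next_generation(matrix):
    tot = {a: sum(row.values()) for a, row in matrix.items()}
    new = {}
    for a in matrix:
        new[a] = {}
        for b in matrix[a]:
            denom = tot[a] + tot[b]
            new[a][b] = matrix[a][b] * tot[a] / denom if denom > 0 else 0.0
    return new


def first_extinction(matrix, extinct_below_ppt=0.5, window=5,
                     tolerance_ppt=1.0, max_generations=5000):
    """Return the first strategy to fall below the extinction threshold,
    or None if the ecosystem stabilizes without any extinction."""
    history = [frequencies_ppt(matrix)]
    while len(history) < max_generations:
        current = history[-1]
        extinct = [s for s in current if current[s] < extinct_below_ppt]
        if extinct:
            return min(extinct, key=current.get)
        if len(history) > window:
            old = history[-1 - window]
            if max(abs(current[s] - old[s]) for s in current) < tolerance_ppt:
                return None
        matrix = next_generation(matrix)
        history.append(frequencies_ppt(matrix))
    return None


def eliminate(raw_scores):
    """Repeatedly rebuild the ecosystem from the raw scores of the
    survivors, deleting the first strategy to become extinct each time."""
    survivors = list(raw_scores)
    order = []
    while True:
        sub = {a: {b: raw_scores[a][b] for b in survivors} for a in survivors}
        loser = first_extinction(sub)
        if loser is None:
            return order, survivors
        order.append(loser)
        survivors.remove(loser)


# Hypothetical raw scores; scores[a][b] is the score of a against b.
scores = {
    "A": {"A": 300, "B": 100, "C": 200},
    "B": {"A": 150, "B": 100, "C": 50},
    "C": {"A": 50, "B": 25, "C": 25},
}
order, survivors = eliminate(scores)
print(order, survivors)
```

In this invented three-strategy example, C is eliminated first and B
second, leaving A as the sole survivor.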

Each column of table 9.1 (except the last) is headed by two
numbers. The first is the number of strategies competing in a given 
ecosystem; the second is the number of generations required to attain 
approximate stability in that ecosystem. These numbers alone yield 
interesting information. 




Table 9.1 — Initial and Stable Frequencies, in Parts per Thousand 
The number of generations required to stabilize an ecosystem is
not (as might be expected) a smoothly decreasing function of the 
number of competing strategies. While such a trend is observable 
overall, many individual reversals of that trend are apparent. Since 
the number of generations required to attain stability diminishes 
only in tendency with the number of competing strategies, and seems 
to depend critically on the particular combination of strategies in 
competition, one can conclude that this eliminatory process is 
somewhat stochastic. 

(An exponential curve fit, which gives the number of generations
required to stabilize the population frequencies as a function
of the number of strategies in competition, yields the result
$y = 16.9\,e^{.169x}$ with the poor correlation r = .65. This low
correlation is expected, since all available (x,y) data are used. A
better fit is obtained by selecting fewer and more convenient (x,y)
data: $y = 26\,e^{.138x}$ with the improved correlation r = .82. These
equations offer one possible explanation as to why Axelrod's
ecosystem does not attain stability after 1000 generations. With 63
competitors, the second equation predicts that 155,000 generations
are required to attain stability. Of course, any such extrapolation
remains highly conjectural.)
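The extrapolation can be reproduced in a line or two. The sketch
below assumes that the second fitted curve has coefficient 26 and
exponent 0.138x, values that are consistent with the stated
prediction of 155,000 generations for 63 competitors; the function
name is hypothetical.

```python
import math


def generations_to_stability(num_strategies, coefficient=26.0, rate=0.138):
    """Evaluate the assumed fitted curve y = coefficient * exp(rate * x)."""
    return coefficient * math.exp(rate * num_strategies)


prediction = generations_to_stability(63)
print(round(prediction, -3))
```

Rounded to the nearest thousand, the curve gives 155,000 generations
for an ecosystem of 63 competitors.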

Each cell of table 9.1 contains three numbers: a given stra- 
tegy's initial frequency, its stable frequency, and its change in 
frequency; all in parts per thousand of the population for the given 
ecosystemic competition. 

One might use table 9.1 to follow the fortunes of the maximiza- 
tion family, which dominates the stable populations of ecosystems 
involving twenty and nineteen strategies. In the ecosystem involving 
18 strategies (following the extinction of NYD), SHU holds the 
greatest initial frequency (65 ppt), while MAC and FRI are tied with 
the second greatest (64 ppt). FRI experiences the largest increase, 
however, and realizes the greatest stable frequency (154 ppt), 
followed by SHU (152 ppt) and MAC (144 ppt). MAE initially ranks 
seventh (60 ppt), but climbs to fourth at stability (114 ppt), while 
MEU initially ranks ninth (56 ppt) but finishes fifth (79 ppt). Thus 
MAC, MAE, and MEU continue to perform quite well, but they slip to
third, fourth and fifth places with respect to magnitudes of stable
frequencies.

A glance at the tournament matrix of raw scores (Appendix Two) 
affords an explanation for what is taking place. In the context of 
the tournament, the maximization family fared extremely well against 
CCC and NYD. In fact, each member of the maximization family realizes 
its two highest scores against these very strategies. But in the 
ecological context, this large margin of success not only contributes 
to the rapid extinction of the weaker strategies, but also proves 
detrimental to the exploitive ones. 

In the tournament, for example, MAC out-scored CCC by 4824 to 
264. So in the ecological scenario, their parent generation ratio is 
thus 4824:264, or about 18:1 in favour of MAC. And in the parent 
generation of the twenty-strategy ecosystem, their respective initial 
frequencies are 60 and 41 ppt of the overall population. Thus the 
ratio of their second-generation offspring is (4824x60) : (264x41), or 
about 27:1 in favour of MAC. In the tournament context, MAC exploits 
CCC rather heavily (as do many other strategies) with no dire conse- 
quences to itself. But in the ecological context, MAC's heavy ex- 
ploitation of CCC has a three-fold result. 

First, MAC benefits from a proportionately large increase in 
progeny. Second, CCC, which experiences a generally poor differential 
procreative rate in the ecosystem as a whole, is unable to stave off 
elimination. Third, in subsequent ecosystems, MAC no longer benefits 
from its high procreative rate in competition against CCC, since CCC 
is extinct. In future ecosystems, MAC must compete more frequently 
against strategies with greater procreative fitness than CCC,
strategies which MAC cannot exploit as readily.
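The parent-generation and second-generation ratios quoted above for
MAC and CCC can be checked with exact arithmetic. The scores and
frequencies are those stated in the text; the variable names are
hypothetical.

```python
from fractions import Fraction

# Tournament scores of MAC and CCC against one another.
parent_ratio = Fraction(4824, 264)

# Weight by the initial frequencies (60 ppt and 41 ppt respectively).
second_ratio = parent_ratio * Fraction(60, 41)

print(float(parent_ratio), float(second_ratio))
```

The first ratio is a little over 18:1 and the second a little under
27:1, in agreement with the figures given.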

This is a classic instance of over-exploitation of a resource,
to the eventual detriment of the exploiters. All strategies that
over-exploit CCC (such as DDD, TQC, TAT, TES, and the maximization
family) abet CCC's rapid extinction, and in so doing deprive
themselves of a competitor which allows them to create large relative
numbers of progeny. When a new ecosystem is established, with CCC
absent from the environment, the population frequencies undergo an
ecological shift, such that those strategies which over-exploited the
extinct competitor now experience corresponding declines in their
procreative rates. In future ecosystems, former exploiters may
themselves become the victims of exploitation.

In the ecosystem with 17 strategies (following the extinction
of TQC), the stable order is once again FRI (136 ppt), SHU (130 ppt),
MAC (129 ppt), MAE (108 ppt), and MEU (86 ppt). The population gaps
between these upper five strategies have closed, compared with the
previous ecosystem. And now, with TAT's extinction, one observes
that, in the first four ecosystems, the lower four strategies of the
tournament have become extinct, in reverse-order of their tournament
ranks, from twentieth to seventeenth (CCC, TQC, NYD, TAT).

TAT's extinction (combined with the previous extinctions) 
results in a re-ordering of initial frequencies in the next ecosys- 
tem, which precipitates new stable standings. In the ecosystem with 
sixteen strategies, CHA (76 ppt), ETH (75 ppt), and TFT (71 ppt) are 
most successful, both initially and at stability, realizing eventual 
frequencies of 125, 122, and 102 ppt respectively. TTT places fourth 
at stability (95 ppt), while SHU manages a tie for fifth with TES (93 
ppt). Evidently, TAT's extinction results in a complete upheaval in 
the environment, with new strategies in the ascendancy, and previous- 
ly successful strategies in decline. MAC slips to seventh at stabili- 
ty (86 ppt); MAE, ninth (66 ppt). Moreover, this ecosystemic competi- 
tion requires the greatest number of generations (more than five 
hundred) to settle down. In addition, the precedent for extinction is 
broken. DDD (which ranks ahead of RAN in the tournament) now vanishes 
from the ecology. 

In ecosystems involving from fifteen to ten competitors, CHA
and ETH continue to predominate at stability, while MAC, SHU, TFT and
TTT also tend to flourish. In ecosystems involving from nine to five
competitors, TFT ranks first four times and second once. The
ecosystemic competition of seven strategies is won by TES. In this
competition, TES experiences the greatest increase of any strategy in
any ecosystem, from an initial frequency of 150 ppt to a stable
frequency, after 36 generations, of 397 ppt. But TES becomes extinct
in the ecosystem of five competitors.

The final ecosystem is composed of four nice strategies: TFT,
SHU, GRO, and ETH. In such a system, all future generations of
progeny maintain respective ratios of 1:1. Thus, initial frequencies
and stable frequencies are identical and equal to one another, and
stability is attained in the parent generation. This situation would,
of course, obtain in an ecosystem of any size, providing that it were
composed exclusively of nice strategies. The other nice strategies,
however (namely CHA, TTT, FRI, NYD and CCC), are already extinct,
because their respective combinations of attributes were disfavoured
in previous ecosystemic competitions.

The last column of table 9.1 contains three numbers for each 
strategy. The first is that strategy's aggregate increase (or dec- 
rease) in frequency, cumulative over the entire ecological scenario; 
in other words, its overall fecundity. The second is the total number 
of generations survived by that strategy in all ecosystemic
competitions (conducted to stability); in other words, its longevity.
The third number is the quotient of the first two; in other words,
that strategy's average rate of increase (or decrease) in frequency,
in parts per thousand per generation extant.

The entries show that, while the nine earliest-extinct
strategies have aggregate decreases in frequency, three of the last
four to become extinct, as well as one of the survivors, also have
aggregate decreases. Thus, while an aggregate increase in frequency
indicates that a competitor does not face early extinction, neither
is it a passport to ultimate survival.

One might find the survival of GRO perplexing. GRO experiences
an increase in only two of the sixteen ecosystemic competitions that
result in an extinction; nonetheless GRO survives to the final
ecosystem. GRO does not excel in any of these competitions, and ranks
near the bottom in all of them. Yet GRO is tenacious enough to
survive them all, apparently by dint of consistent mediocrity. Since
GRO is never highly successful, it cannot be said to depend on any
particular strategies for its success. Hence GRO is not subject to
the vicissitudes of over-exploitation, which cause the rise and fall
of many of its more successful, and later extinct, competitors.

Similarly, TES is the last strategy to become extinct. Despite
its superlative performance in the one ecosystemic competition, TES
also has an aggregate decrease in frequency.

On the other side of the coin, one finds that ETH has the 
largest aggregate increase in frequency, yet ETH won only two of the 
eliminatory ecosystemic competitions. Moreover, CHA has a larger 
aggregate increase than three of the four survivors, yet CHA even- 
tually succumbs to extinction. 

Table 9.2 illustrates the waxing and waning fortunes of the top 
five ranking strategies, at stability, for each of the ecosystemic 
competitions. 


Table 9.2 — Top Five Strategies, With Respect to Stable Frequency 


Competitors  First           Second          Third      Fourth  Fifth
             Place           Place           Place      Place   Place
20           MAC             MAE             MEU        SHU     TES
19           MAC             MAE             MEU        SHU     FRI
18           FRI             SHU             MAC        MAE     MEU
17           FRI             SHU             MAC        MAE     MEU
16           CHA             ETH             TFT        TTT     SHU, TES
15           CHA             ETH             TFT        SHU     TTT
14           CHA             ETH             MAC, SHU   -       TFT
13           CHA             MAC             SHU        ETH     TFT
12           CHA             ETH             MAC        TTT     TFT
11           ETH             CHA             TTT        TFT     TES
10           ETH             CHA             lt         680     MAC
9            TFT             ETH             TES        TTT     CHA
8            TFT, ETH, SHU   -               -          GRO     CHA
7            TES             TFT, SHU, ETH   -          -       GRO
6            TFT, SHU, ETH   -               -          TES     GRO
5            TFT, SHU, ETH   -               -          GRO     TES


Table 9.2 shows that, in general, the maximization family is 
most successful in the larger ecosystems; the optimization family, in 
the medium-sized ecosystems; the tit-for-tat family, in the smaller 
ecosystems. But no single strategy emerges as most robust overall, if 
the sole criterion of robustness is stable frequency. Indeed, though 
several strategies claim varying degrees of success in different 
sizes of ecosystem, it does not seem possible to ascribe a coherent 
order of robustness from one criterion alone.

One criterion suffices for Axelrod, who conducts a single
ecosystemic competition among sixty-three strategies. Based solely on 
its magnitude of relative frequency in the population, TFT wins that 
particular competition. However, given what transpires in eliminatory 
ecosystemic competitions in the environment of the interactive 
tournament, it seems feasible to speculate that, if a similar range 
of competitions were conducted in Axelrod's environment, no single 
strategy would win them all. It seems rather more likely that one 
would observe a similar waxing and waning of strategic procreativity 
in different ecosystems. 

Be that as it may, the question remains: how can one assess 
robustness across the range of ecosystemic competitions? Clearly, 
there is no unique way to accomplish this task. One possible method 
consists in a parametric approach, where relative robustness can be 
quantified according to certain parameters. The parameters themselves 
are quantifications of vital attributes of robustness in the ecologi- 
cal context. In other words, the above question is answered in three 
stages. First, vital properties of an ideal ecologically-robust 
strategy are posited. Second, the varying extents to which the 
competing strategies embody these properties are quantified according 
to appropriate ranking schemes. Third, these quantifications serve as 
parameters which reflect each strategy's combined embodiment of vital 
properties, and which permit a corresponding overall index of robust- 
ness to be assigned. 

This enquiry utilizes four parameters, drawn from the ecologi- 
cal scenario. Four vital properties of an ecologically-robust stra- 
tegy are posited, and their corresponding parameters defined, as 
follows: 

(1) The ideal ecologically-robust strategy's progeny are able 
to avoid extinction. Hence the first parameter is survival, or 
ecosystemic longevity. Each strategy is ranked in ascending order of 
the total number of generations during which its progeny avoid 
extinction (regardless of their relative frequencies, if non-zero). 

(2) The ideal ecologically-robust strategy is reproductively
fit; i.e., its number of progeny increases in future generations.
Hence the second parameter is overall average increase in relative
population frequency, between initial and stable states of every
ecosystem. Each strategy is ranked in ascending order of the quotient
of its aggregate increase in frequency and the number of generations
its progeny survive. This quotient is thus a measure of a strategy's
average increase in relative frequency, in parts per thousand of the
population per generation extant. (A negative increase, of course,
indicates a decrease.)

(3) The ideal ecologically-robust strategy maintains a consis- 
tently high stable frequency, from one ecosystemic competition to 
another. Hence the third parameter is overall stable efficiency. A 
strategy's stable efficiency is computed in the following way. 
Suppose that strategy A has the j-th largest stable frequency in an
ecosystemic competition involving k competitors (including itself).
Thus, strategy A achieves a higher stable frequency than (k-j) other
competitors. Its best possible performance (if it finishes first)
entails achieving a higher stable frequency than (k-1) other
competitors (excluding itself). Hence, strategy A's relative stable
efficiency in this competition is (k-j)/(k-1). Strategy A's overall
relative stable efficiency, in n ecosystemic competitions, is
therefore

\[
\frac{(k_1 - j_1) + (k_2 - j_2) + \cdots + (k_n - j_n)}{(k_1 - 1) + (k_2 - 1) + \cdots + (k_n - 1)}
\]

which is the net ratio of the number of competitors it betters to the
number of competitors it faces. Each strategy is ranked in ascending
order of its overall stable efficiency.

(4) The ideal ecologically-robust strategy shows adaptivity 
across the range of ecosystemic competitions, by means of consistent 
improvement within them. That is, it consistently increases its 
frequency, relative to other competitors, thereby tending to improve 
its position in a given competition. Hence the fourth parameter is 
the sum of the fractions of competitors overtaken in each competition,
divided by the total number of competitions. If a strategy
overtakes j_1/k_1 of its competitors in its first competition, j_2/k_2
of its competitors in its second competition, and so on, up to and
including j_n/k_n of its competitors in its n-th competition (where
j_i is the signed number of competitors overtaken, and k_i the number
of competitors faced, in the i-th competition), then that strategy's
average adaptivity is

\[
\frac{1}{n} \sum_{i=1}^{n} \frac{j_i}{k_i}
\]


Each strategy is ranked in ascending order of the signed magnitude of 
its adaptivity, whose dimensions are: average fraction of competitors 
overtaken, per competition. (A negative adaptivity obtains when a 
strategy is overtaken by more competitors than it overtakes.) 

The rankings of the strategies according to parameters (1) and 
(2), net longevity and average fecundity, are determined from table 
9.1. The rankings according to parameters (3) and (4), stable ef- 
ficiency and adaptivity, are determined from table 9.3 (overleaf). 

Each cell of table 9.3 displays the initial and stable rankings 
for a given strategy in a given competition, according to the stra- 
tegy's initial and stable relative frequencies. The given cell then 
displays that strategy's point-award in that competition (to be used 
in the calculation of its overall stable efficiency) and its change 
in frequency rank (to be used in the calculation of its adaptivity). 
For example, in the competition involving twenty strategies, MEU's 
initial and stable frequencies are seventh and third-largest,
respectively; whence the entry "7-3". So, at stability, MEU betters 20
minus 3, or 17 competitors; and MEU's change in rank is 7 minus 3, or 
+4, whence the entry "+4,17". 
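
The two numbers recorded in each cell follow directly from the rank pair, and can be sketched in a few lines of Python. This is an illustrative sketch only; the function name `cell_entry` is hypothetical and does not come from the enquiry.

```python
def cell_entry(k, initial_rank, stable_rank):
    """Return (change_in_rank, point_award) for one strategy in one competition.

    k is the number of strategies competing, including the strategy itself.
    """
    change = initial_rank - stable_rank   # positive when the strategy moves up
    award = k - stable_rank               # competitors bettered at stability
    return change, award


# MEU in the twenty-strategy competition: seventh initially, third at stability.
print(cell_entry(20, 7, 3))
```

For MEU's entry this reproduces the change of +4 and the point-award of 17 quoted above.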

Once again, a strategy's overall relative stable efficiency is 
the ratio of the sum of its stable point-awards (the total number of 
competitors it betters) to the total number of competitors it faces. 
MEU, for example, faced a total of (19+18+17+16+15+14+13+12+11) or 
135 competitors, in consecutive ecosystemic competitions, before 
becoming extinct. MEU bettered a total of (17+16+13+12+5+4+2+1+0) or 
70 competitors, in terms of stable frequency rankings in these 
competitions. Hence, MEU's overall stable efficiency is 70/135, or 
51.9 percent. By contrast, for example, TIT faced a total of 175 
competitors (it survived more competitions than did MEU), of which it 
bettered a total of 94. TIT's overall stable efficiency is thus
94/175, or 53.7 percent.
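
The stable-efficiency arithmetic can be checked with a short Python sketch. The function name `stable_efficiency` and its argument layout are assumptions made for illustration; the point-awards are those quoted for MEU in the text.

```python
def stable_efficiency(awards, field_sizes):
    """Ratio of competitors bettered to competitors faced, over all competitions.

    awards[i]      -- point-award (competitors bettered) in competition i
    field_sizes[i] -- number of strategies in competition i, including this one
    """
    faced = sum(k - 1 for k in field_sizes)
    return sum(awards) / faced


# MEU's nine competitions, from twenty strategies down to twelve.
meu_awards = [17, 16, 13, 12, 5, 4, 2, 1, 0]
meu_fields = list(range(20, 11, -1))
print(f"{100 * stable_efficiency(meu_awards, meu_fields):.1f} percent")
```

Pooling numerators and denominators before dividing, as here, weights each competition by its size, which is what the "net ratio" in the definition of the third parameter implies.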


Table 9.3 - Fecundity Rankings at Initial and Stable Frequencies

Entries: initial-stable rank (change in rank, point-award). Columns
give the number of strategies in the competition.

Strategy  20              19              18              17
TFT       7-10 (-3,10)    7-8 (-1,11)     6-6 (+0,12)     6-6 (+0,11)
SHU       3-4 (-1,16)     2-4 (-2,15)     1-2 (-1,16)     2-2 (+0,15)
GRO       11-17 (-6,3)    11-14 (-3,5)    11-15 (-4,3)    10-15 (-5,2)
ETH       4-9 (-5,11)     3-7 (-4,12)     4-7 (-3,11)     2-9 (-7,8)
TES       7-5 (+2,15)     8-11 (-3,8)     7-9 (-2,9)      7-7 (+0,10)
CHA       4-8 (-4,12)     3-6 (-3,13)     4-8 (-4,10)     1-8 (-7,9)
TIT       10-13 (-3,7)    8-9 (-1,10)     9-10 (-1,8)     7-12 (-5,5)
FRI       4-6 (-2,14)     3-5 (-2,14)     2-1 (+1,17)     5-1 (+4,16)
MAC       1-1 (+0,19)     1-1 (+0,18)     2-3 (-1,15)     4-3 (+1,14)
BBE       11-15 (-4,5)    11-13 (-2,6)    11-11 (+0,7)    10-13 (-3,4)
MAE       2-2 (+0,18)     6-2 (+4,17)     7-4 (+3,14)     9-4 (+5,13)
MEU       7-3 (+4,17)     8-3 (+5,16)     10-5 (+5,13)    10-5 (+5,12)
RAN       15-16 (-1,4)    15-16 (-1,3)    14-16 (-2,2)    14-16 (-2,1)
TOD       13-11 (+2,9)    13-12 (+1,7)    13-14 (-1,4)    13-14 (-1,3)
MAD       13-7 (+6,13)    14-10 (+4,9)    15-11 (+4,7)    15-10 (+5,7)
DDD       15-12 (+3,8)    16-15 (+1,4)    15-13 (+2,5)    15-11 (+4,6)
TAT       17-14 (+3,6)    18-17 (+1,2)    18-17 (+1,1)    17-17 (+0,0)
TOC       19-18 (+1,2)    18-18 (+0,1)    17-17 (+0,1)
NYD       18-19 (-1,1)    16-18 (-2,1)
CCC       20-20 (+0,0)

Strategy  16              15              14              13
TFT       3-3 (+0,13)     3-3 (+0,12)     3-5 (-2,9)      3-5 (-2,8)
SHU       5-5 (+0,11)     4-4 (+0,11)     5-3 (+2,11)     5-3 (+2,10)
GRO       9-12 (-3,4)     9-12 (-3,3)     8-10 (-2,4)     8-10 (-2,3)
ETH       2-2 (+0,14)     1-2 (-1,13)     2-2 (+0,12)     1-4 (-3,9)
TES       5-5 (+0,11)     6-6 (+0,9)      5-7 (-2,7)      5-8 (-3,5)
CHA       1-1 (+0,15)     1-1 (+0,14)     1-1 (+0,13)     1-1 (+0,12)
TIT       4-4 (+0,12)     4-5 (-1,10)     4-8 (-4,6)      4-7 (-3,6)
FRI       8-8 (+0,8)      8-8 (+0,7)      8-6 (+2,8)      9-6 (+3,7)
MAC       7-7 (+0,9)      6-6 (+0,9)      7-3 (+4,11)     7-2 (+5,11)
BBE       11-10 (+1,6)    11-10 (+1,5)    11-11 (+0,3)    11-11 (+0,2)
MAE       10-9 (+1,7)     10-9 (+1,6)     10-9 (+1,5)     10-9 (+1,4)
MEU       12-11 (+1,5)    12-11 (+1,4)    12-12 (+0,2)    12-12 (+0,1)
RAN       13-14 (-1,2)    13-14 (-1,1)    13-13 (+0,1)    13-13 (+0,0)
TOD       14-13 (+1,3)    14-13 (+1,2)    14-14 (+0,0)
MAD       15-15 (+0,1)    15-15 (+0,0)
DDD       16-16 (+0,0)

Strategy  12              11              10              9
TFT       3-5 (-2,7)      4-4 (+0,7)      3-6 (-3,4)      1-1 (+0,8)
SHU       6-8 (-2,4)      6-7 (-1,4)      7-8 (-1,2)      5-6 (-1,3)
GRO       7-10 (-3,2)     6-8 (-2,3)      5-4 (+1,6)      7-7 (+0,2)
ETH       1-2 (-1,10)     1-1 (+0,10)     1-1 (+0,9)      2-2 (+0,7)
TES       5-6 (-1,6)      5-5 (+0,6)      5-7 (-2,3)      4-3 (+1,6)
CHA       1-1 (+0,11)     2-2 (+0,9)      2-2 (+0,8)      3-5 (-2,4)
TIT       3-4 (-1,8)      3-3 (+0,8)      3-3 (+0,7)      5-4 (+1,5)
FRI       9-9 (+0,3)      9-9 (+0,2)      9-9 (+0,1)      8-8 (+0,1)
MAC       7-3 (+4,9)      8-6 (+2,5)      8-5 (+3,5)      9-9 (+0,0)
BBE       11-11 (+0,1)    10-10 (+0,1)    10-10 (+0,0)
MAE       10-7 (+3,5)     11-11 (+0,0)
MEU       12-12 (+0,0)


And MEU's average adaptivity, for example, is found from the
sum (4/19 + 5/18 + 5/17 + 5/16 + 1/15 + 1/14 + 0/13 + 0/12 +
0/11) [the fractions of competitors it overtakes in each competition],
divided by 9 [the total number of competitions in which MEU competes].
MEU's adaptivity is therefore +.137. By contrast, TIT's
adaptivity is (-3/19 - 1/18 - 1/17 - 5/16 - 0/15 - 1/14 - 4/13 - 3/12
- 1/11 + 0/10 + 0/9 + 1/8 + 1/7 + 0/6) divided by 14, or -.074. So,
in terms of difference between stable and initial population fre- 
quency, MEU overtakes, on average, .137 of its competitors per 
competition; while TIT is overtaken, on average, by .074 of its 
competitors per competition. 
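
The two adaptivity figures can be reproduced with a short Python sketch. The function name `adaptivity` and the assumption that each strategy's competitions run consecutively from the twenty-strategy competition downward are illustrative; the rank changes are those quoted in the text for MEU and TIT.

```python
from fractions import Fraction


def adaptivity(rank_changes, first_field=20):
    """Average signed fraction of competitors overtaken per competition.

    rank_changes[i] is the change in rank (initial minus stable) in the i-th
    competition. That competition has first_field - i strategies, so the
    strategy faces first_field - i - 1 competitors.
    """
    total = sum(Fraction(change, first_field - i - 1)
                for i, change in enumerate(rank_changes))
    return float(total / len(rank_changes))


meu_changes = [4, 5, 5, 5, 1, 1, 0, 0, 0]
tit_changes = [-3, -1, -1, -5, 0, -1, -4, -3, -1, 0, 0, 1, 1, 0]
print(round(adaptivity(meu_changes), 3), round(adaptivity(tit_changes), 3))
```

Unlike stable efficiency, this parameter averages the per-competition fractions, so each competition counts equally regardless of its size.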

Thus, while TIT is slightly more efficient than MEU, it is also
considerably less adaptive. All strategies’ overall stable efficien- 
cies, and adaptivities, are displayed in the last column of table 
9.3. 

Now, from tables 9.1 and 9.3, one has four different ranking 
schemes, which order the strategies in terms of the four parameters: 
longevity, fecundity, stability, and adaptivity. These rank parameters
are abbreviated, respectively, as R_l, R_f, R_s and R_a. With each
strategy is then associated a unique set of four rank numbers, which
correspond to that strategy's particular values for (R_l, R_f, R_s, R_a).

A given strategy's index of robustness, I_r, is evaluated in the
following way. Each of its four rank numbers is subtracted from
twenty, to give the number of competitors it betters according to
each parameter. These four new numbers are then added, and their sum
is divided by 76 (which is the total number of competitors it could
have bettered overall; i.e., nineteen competitors in each of four
schemes). This quotient is the given strategy's index of robustness.
That is,

\[
I_r = \frac{(20 - R_l) + (20 - R_f) + (20 - R_s) + (20 - R_a)}{76}
\]

or

\[
I_r = \frac{80 - (R_l + R_f + R_s + R_a)}{76}
\]


The ideal ecologically-robust strategy would rank first in each 
scheme, and its index of robustness would then attain the maximum 
value of unity. An utterly non-robust strategy would rank twentieth 
in each scheme, and its index of robustness would take on the minimum 
value of zero.

The magnitudes of the four parameters, their corresponding rank
numbers, and the resulting indices of robustness are displayed in 
table 9.4. 
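
As a minimal sketch of the index calculation, the following Python function takes a strategy's four rank numbers and returns its index of robustness. The function name and the default of twenty strategies are assumptions for illustration; the rank numbers used are those listed for MAC in table 9.4.

```python
def robustness_index(ranks, n_strategies=20):
    """Index of robustness from the rank numbers for longevity, fecundity,
    stable efficiency and adaptivity."""
    bettered = sum(n_strategies - r for r in ranks)
    return bettered / (len(ranks) * (n_strategies - 1))


# MAC ranks 9th, 3rd, 2nd and 3rd in the four schemes.
print(round(robustness_index([9, 3, 2, 3]), 3))
```

A strategy ranked first throughout scores 1.0 and one ranked twentieth throughout scores 0.0, matching the stated extremes.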

According to this parametric approach, MAC is the most ecologically-robust
strategy, followed by SHU, ETH, TFT and MAE, to round
out the top five. Although MAC became extinct earlier than its most 
robust rivals (which rank first in longevity compared to MAC's 
ninth), these rivals prove comparatively less adaptive. In fact, SHU, 
ETH and TFT are all negatively-adapted; that is, they are surpassed, 
on average, by a larger fraction of competitors than they surpass. 

These parameters are quite revealing with respect to the 
competitive performance of a given strategy, as indeed they must be 
if they are to provide a reasonable quantification of robustness. 


Table 9.4 - Quantification of Ecological Robustness


       Longevity  R_l   Avg Fecundity  R_f   Stable Eff %  R_s   Adaptivity  R_a   I_r

MAC      2995      9       +.117        3       77.2        2      +.135      3    .829
SHU      3261      1       +.109        4       75.9        3      -.025     11    .803
ETH      3261      1       +.232        1       80.2        1      -.089     17    .789
TFT      3261      1       +.108        5       72.7        5      -.062     14    .724
MAE      2666     11       +.059        6       61.4        6      +.128      4    .697
MEU      2447     12       -.015        7       51.9       10      +.137      2    .645
CHA      3227      6       +.150        2       74.4        4      -.236     19    .645
FRI      3106      8       -.042       10       58.0        7      +.038      7    .632
MAD      1789     15       -.063       11       37.4       11      +.181      1    .553
TES      3260      5       -.023        8       56.5        8      -.157     18    .539
TIT      3179      7       -.038        9       53.7        9      -.074     15    .526
DDD      1259     16       -.085       12       27.1       13      +.116      5    .447
GRO      3261      1       -.159       16       27.3       12      -.271     20    .408
TOD      2031     14       -.102       13       25.0       15      +.025      8    .395
BBE      2917     10       -.13?       14       26.0       14      -.034     12    .395
TAT       795     17       -.2??       17       12.9       16      +.068      6    .316
RAN      2383     13       -.145       15       11.3       17      -.061     13    .289
TOC       345     18       -.365       19        7.4       18      +.018      9    .211
CCC         9     20       -4.60       20        0.0       20      ±.000     10    .132
NYD       335     19       -.297       18        3.4       19      -.082     16    .105


Examine the case of ETH, for example. ETH shares the greatest 
longevity, produces the largest average number of offspring per
generation, and is most efficient in overall stable frequency rankings.
Given this outstanding combination of attributes, one might
expect ETH to win a substantial number of ecosystemic competitions. 
Yet a glance at table 9.2 shows that ETH is the outright winner in 
only two of the competitions. Moreover, those competitions do not
involve a relatively large number of strategies (11 and 10 strate- 
gies, respectively). In fact, in table 9.2, ETH is conspicuously 
absent from the top rankings in competitions involving 20, 19, 18 and 
17 strategies. Why does ETH not fare better? 

The fourth parameter provides an explanation. ETH turns out to 
be one of the least-adaptive strategies. ETH's great longevity, 
prodigious fecundity, and high efficiency do not reveal its principal 
weakness: in larger groups, ETH is readily overtaken by a substantial
fraction of competitors. These competitors, which produce fewer 
progeny on average than ETH, and which better fewer strategies 
overall than ETH, are nevertheless more reproductively fit than ETH 
when the competitive traffic is heaviest (as table 9.3 reveals). 
Thus, notwithstanding ETH's fortitude with respect to three attributes,
ETH's robustness is compromised by an acute lack of adaptivity
in large groups. 

No single attribute, however outstanding, suffices for great 
robustness in eliminatory competitions. GRO, for example, has a share 
of the greatest longevity, but it experiences a considerable average 
decrease in fecundity, a middling stable efficiency, and the poorest 
adaptivity in the scenario. These results sink GRO to thirteenth 
place in robustness. Thus, while GRO endures, it neither thrives nor 
prospers. Similarly, MAD is the most adaptive strategy, surpassing a 
larger average fraction of its competitors than any other strategy; 
but MAD is fairly short-lived, negatively-fecund, and not very
efficient. In sum, MAD ranks ninth in robustness. 

The seven most robust strategies, not surprisingly, are also 
the seven most fecund (though not in that order). The three most 
robust strategies are also the most efficient (though again, not in 
that order). Overall, fecundity and efficiency are the most closely 
correlated pair of attributes. But the two most robust strategies, 
MAC and SHU, show respective improvements in rank with respect to
this attribute pair. MAC ranks third in fecundity and second in
efficiency; SHU, fourth in fecundity and third in efficiency. This 
type of improvement, however slight, denotes an interesting perfor- 
mance characteristic; namely, an effective frequency distribution of 
progeny across the range of ecosystemic competitions. 

Average relative frequencies, by definition, do not take 
instantaneous changes in frequency (from one competition to another) 
into account. MAC experiences an increase in progeny in nine of its 
twelve competitions; SHU, an increase in eleven of its seventeen 
competitions. Both MAC and SHU achieve frequency distributions which, 
in terms of rank efficiency, enable these strategies to realize the 
beneficial potential of their increases and to minimize the detrimen- 
tal effects of their decreases. 

In contrast, CHA ranks second in fecundity, but slips to fourth 
in efficiency. Although CHA's average increase in progeny is greater 
than that of MAC and SHU, CHA's distribution of instantaneous in- 
creases is less effective. CHA experiences an increase in progeny in 
ten of its fifteen competitions (and no change in one competition), 
but its largest increases occur in competitions in which a smaller 
increase would confer the same efficiency rank. In other words, CHA 
produces more offspring than it requires in some situations, and not 
enough in others. CHA is nonetheless relatively robust, although its 
robustness is severely compromised by its poor adaptivity. 

The point to be made here is that, notwithstanding instances of 
pair-wise correspondence between R_f and R_s, these two rank parameters
reflect quite distinct attributes. A given strategy's difference in 
rank between these parameters (or lack thereof) is indicative of a 
particular performance characteristic. 

Finally, one can compare strategic robustness in the combinato- 
ric sub-tournaments of the previous chapter with strategic robustness 
in this ecological scenario. The order of overall robustness is 
determined by taking the average of each strategy's rank with respect 
to combinatoric and ecological robustness. Since MAC ranks first in 
both categories, it is obviously most robust overall in the interactive
environment. SHU is deserving of second overall, while MAE
retains third overall despite its decline in the ecological scenario.


Table 9.5 - Comparison of Strategic Robustness


Rank    Combinatoric    Ecological    Overall
        Robustness      Robustness    Robustness

  1     MAC             MAC           MAC
  2     MAE             SHU           SHU
  3     SHU             ETH           MAE
  4     FRI             TFT           ETH
  5     CHA             MAE           TFT, CHA
  6     ETH             MEU, CHA      -
  7     TFT             -             FRI
  8     TES             FRI           MEU
  9     MEU             MAD           TES
 10     TIT             TES           MAD
 11     MAD             TIT           TIT
 12     GRO             DDD           GRO
 13     TOD             GRO           DDD, TOD
 14     BBE             TOD           -
 15     DDD             BBE           BBE
 16     RAN             TAT           RAN, TAT
 17     TAT             RAN           -
 18     NYD             TOC           TOC
 19     TOC             CCC           NYD
 20     CCC             NYD           CCC
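
The averaging that produces the overall column of table 9.5 can be sketched as follows. The variable names are hypothetical, and the handling of ties (tied strategies share a rank and the next rank is skipped) is an assumption read off the table rather than a rule stated in the text.

```python
# Combinatoric order, first to twentieth, as listed in table 9.5.
combinatoric = ["MAC", "MAE", "SHU", "FRI", "CHA", "ETH", "TFT", "TES", "MEU", "TIT",
                "MAD", "GRO", "TOD", "BBE", "DDD", "RAN", "TAT", "NYD", "TOC", "CCC"]

# Ecological ranks; MEU and CHA tie for sixth, so no strategy is ranked seventh.
ecological = {"MAC": 1, "SHU": 2, "ETH": 3, "TFT": 4, "MAE": 5, "MEU": 6, "CHA": 6,
              "FRI": 8, "MAD": 9, "TES": 10, "TIT": 11, "DDD": 12, "GRO": 13,
              "TOD": 14, "BBE": 15, "TAT": 16, "RAN": 17, "TOC": 18, "CCC": 19,
              "NYD": 20}

comb_rank = {name: i + 1 for i, name in enumerate(combinatoric)}
average = {name: (comb_rank[name] + ecological[name]) / 2 for name in combinatoric}

# Sort by average rank; equal averages appear as ties in the overall column.
overall = sorted(combinatoric, key=lambda name: average[name])
for name in overall:
    print(name, average[name])
```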


Once again, it must be stressed that this parametric approach 
to the evaluation of ecological robustness is by no means a unique 
determinant; many other schemes could be conceived and applied. The 
addition or deletion of a single parameter can alter the standings, 
either mildly or drastically. One might hypothesize that a parametric 
approximation of ecological robustness would improve as the number of 
parameters increases. While more (or fewer) than four parameters 
could be used, the result in this case seems reasonably unbiased. At 
the least, an attempt has been made to neutralise or otherwise 
balance any bias that inheres in such a quantification. 

The ecological scenario is clearly rich in interactions and 
implications, and many more such models can and should be developed 
within the evolutionary paradigm. The main difference between ecological
and evolutionary modelling is, as Axelrod points out, that
the former does not admit of any "mutational" influences. In other
words, the ecology unfolds strictly from initial conditions, with no 
behavioural modifications made to the strategies involved. However, 
it is evident that, on the basis of strategic interaction alone, and 
in the absence of strategic modification, the complexities of elimin- 
atory ecosystemic competition necessitate correspondingly complex 
methods of assessing robustness.

Having found MAC to be the most robust strategy overall in the 
interactive environment, according to combinatoric and ecological 
criteria that are admittedly not unique but also not necessarily 
unfair, this enquiry now seeks to answer some questions raised by 
these findings. Why is MAC most robust? Why are MAC's maximization 
family members less robust? Given that these family members differ 
only by an initial probabilistic weighting factor, why does their 
familial order of robustness increase with the co-operativeness of 
their respective weightings? What are MAC's principal weaknesses, and 
can they be improved? 

It is the task of the next section to address these and other 
pertinent questions. 


1 Ibid., p.399.


PART FOUR: 
THE ETHIC OF COLLECTIVE RATIONALITY


Chapter Ten
The Maximization Family Versus Others 


Thus far, analyses of the interactive tournament have taken 
completed game scores as a departure point for the manipulation of 
data. A strategy's game scores have been treated as "finished pro- 
ducts", from which desired “by-products” (such as combinatoric and 
ecological robustness) are obtained. This part of the enquiry is 
devoted to an examination of the actual process by which the maxi- 
mization family "manufactures" its game scores. So, while Part Three 
can be said to have adopted a macroscopic view of the interactive 
tournament, the first two chapters of Part Four will adopt a micro- 
scopic view. This higher resolution of analysis should enable a 
better understanding of the mechanics of the maximization family, and 
of MAC's particular robustness in the interactive environment. 

Axelrod concludes the analysis of his second tournament with a 
salient observation: 


“Being able to exploit the exploitable without paying too 
high a cost with the others is a task which was not 
successfully accomplished by any of the entries in round 
two of the tournament." 


The main implication of his observation is that a strategy capable of 
accomplishing this task could have won the second tournament. 

The only maximization strategy to have participated in that 
tournament was Downing, an equivalent of MEU. Downing is fully able 
to exploit the exploitable, but Downing ranked fortieth among sixty- 
three strategies. This undistinguished performance is attributable to 
Downing having paid too high a price with the others. 

In the environment of the interactive tournament, the task 
defined by Axelrod is accomplished in large measure by MAC, and to a
lesser extent by MAE. Although MEU and MAD are also able to exploit 
the exploitable, they (like Downing) pay too high a price with 
certain others. Indeed, it has been quite apparent that the performance
of the maximization family members improves with the co-operativeness
of their weightings. To illustrate the performances of the
maximization family in the context of Axelrod's task, one examines
their games against two representative strategies: CCC (representing
"the exploitable"), and TFT (representing "the others").2


1 Axelrod, 1980b, p.403.

Recall that the maximization family members begin their games 
by co-operating or defecting randomly for one hundred moves, while 
recording all joint outcomes in an event matrix. During these one 
hundred moves, MAC co-operates with probability 9/10; MAE, with 
probability 5/7; MEU, with probability 1/2; MAD, with probability 
1/10. From the one-hundred-and-first move onward, members of this 
family maximize their expected utilities, and continue to update the 
event matrix after each joint outcome. The frequency distribution of 
outcomes in the event matrix constitutes the a posteriori probability 
distribution in the calculation of expected utilities. 

Recall also that the respective payoffs of the possible outcomes
are: (C,c) = 3,3; (C,d) = 0,5; (D,c) = 5,0; (D,d) = 1,1. Thus, if outcome
(C,c) occurs on W occasions, (C,d) on X occasions, (D,c) on Y occasions,
and (D,d) on Z occasions, then the expected utilities are

\[
\mathrm{EUC} = \frac{3W}{W + X}, \qquad \mathrm{EUD} = \frac{5Y + Z}{Y + Z}
\]
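
The two expected utilities can be computed directly from the event matrix, as in the following Python sketch. The function name is hypothetical, and returning `None` for an action that has never been tried is a convention assumed here, not one taken from the enquiry.

```python
def expected_utilities(W, X, Y, Z):
    """Expected utilities of co-operation and defection from an event matrix.

    W, X, Y, Z count the outcomes (C,c), (C,d), (D,c), (D,d); the maximizer's
    payoffs for these outcomes are 3, 0, 5 and 1 respectively.
    """
    euc = 3 * W / (W + X) if W + X else None
    eud = (5 * Y + Z) / (Y + Z) if Y + Z else None
    return euc, eud


# An opponent that has never defected: seven (C,c) and ninety-three (D,c).
print(expected_utilities(7, 0, 93, 0))
```

With no (C,d) or (D,d) outcomes recorded, both utilities sit at their maxima of 3 and 5.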


First, consider how the maximization family exploits CCC,
beginning with MAD. By definition, CCC co-operates unconditionally,
while MAD co-operates with probability 1/10 during the first one
hundred moves. In game 10.1, seven outcomes of (C,c) and ninety-three
outcomes of (D,c) have occurred. The score is correspondingly lop- 
sided. 


2 Note: the final scores of these sample games may differ 
slightly from the game scores between identical opponents in Appendix 
Two. Differences are due to expected fluctuations in the computer's 
pseudo-random generator, which is re-seeded on each occasion that a 
program containing a probabilistic component is run.


Game 10.1 - MAD versus CCC, Event Matrix After 100 Moves


                   CCC
                 c      d
          C      7      0      EUC = (3x7)/(7+0) = 3
   MAD
          D     93      0      EUD = (5x93 + 1x0)/(93+0) = 5


Score after 100 moves: MAD 486, CCC 21


Note that the expected utility of co-operation is at its 
theoretical maximum (which is 3 for this particular payoff structure),
since no instances of (C,d) have occurred. A (C,d) outcome
would reduce the expected utility of co-operation by its presence in 
the denominator of the EUC equation. Similarly, the expected utility 
of defection is also at its theoretical maximum (which is 5 for this 
particular payoff structure), since no instances of (D,d) have 
occurred. Although both EUs are at their respective maxima, EUD > 
EUC. Hence MAD defects on move one-hundred-and-one. 

After two hundred moves, the situation is as follows: 


Game 10.2 - MAD versus CCC, Event Matrix After 200 Moves


                   CCC
                 c      d
          C      7      0      EUC = (3x7)/(7+0) = 3
   MAD
          D    193      0      EUD = (5x193 + 1x0)/(193+0) = 5


Score after 200 moves: MAD 986, CCC 21


Between moves 101 and 200, there have been 100 successive 
outcomes of (D,c), with MAD having out-scored CCC by 500 to zero. CCC 
continues to co-operate, while MAD's expected utilities remain
unchanged. As on the previous one hundred moves, defection is
prescribed for MAD on move two-hundred-and-one.

By induction, it is obvious that the outcome (D,c) obtains for 
the duration of the game. Indeed, after 1000 moves, one finds: 


Game 10.3 - MAD versus CCC, Event Matrix After 1000 Moves


                   CCC
                 c      d
          C      7      0      EUC = (3x7)/(7+0) = 3
   MAD
          D    993      0      EUD = (5x993 + 1x0)/(993+0) = 5


Score after 1000 moves: MAD 4986, CCC 21
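
The inductive argument can be checked by continuing the game mechanically from the 100-move event matrix. The sketch below is illustrative only: the function name is hypothetical, and it assumes both counts are at least 1 and that the maximizer defects only when EUD strictly exceeds EUC.

```python
def continue_vs_ccc(W, Y, score, ccc_score, moves):
    """Continue a maximizer-versus-CCC game for `moves` further moves.

    W and Y count (C,c) and (D,c) outcomes; CCC never defects, so X = Z = 0.
    """
    for _ in range(moves):
        euc = 3 * W / W          # always 3 while no (C,d) has occurred
        eud = 5 * Y / Y          # always 5 while no (D,d) has occurred
        if eud > euc:
            Y += 1               # outcome (D,c): maximizer 5, CCC 0
            score += 5
        else:
            W += 1               # outcome (C,c): 3 points each
            score += 3
            ccc_score += 3
    return W, Y, score, ccc_score


# From the 100-move state (7 and 93 outcomes, score 486 to 21), play 900 more moves.
print(continue_vs_ccc(7, 93, 486, 21, 900))
```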


CCC is thoroughly exploited by MAD. Again by induction (follo- 
wing the one hundred probabilistic moves), one can show that the 
other maximization family members also exploit CCC. Consider MEU's 
performance, after 100 and 1000 moves: 


Game 10.4 - MEU versus CCC, Event Matrices (100 & 1000 Moves)


   100 moves:        CCC              1000 moves:       CCC
                   c      d                           c      d
            C     50      0                    C     50      0
     MEU                                MEU
            D     50      0                    D    950      0
   EUC = 3, EUD = 5                   EUC = 3, EUD = 5
   Score: MEU 400, CCC 150            Score: MEU 4900, CCC 150


The only difference between MAD's and MEU's performances 
against CCC lies in the frequency distribution of outcomes after 100 
moves. Since MEU is probabilistically more co-operative than MAD, 
more instances of (C,c) obtain during MEU's initial hundred moves. As
a result, CCC garners more points. But observe that MEU's expected
utilities are identical to MAD's, and that they too remain constant 
throughout. Hence MEU defects from move 101 to the end of the game, 
and ultimately exploits CCC almost as thoroughly as does MAD. 

Similarly, MAE exploits CCC, but not quite as thoroughly as 
does MEU: 


Game 10.5 - MAE versus CCC, Event Matrices (100 & 1000 Moves)


   100 moves:        CCC              1000 moves:       CCC
                   c      d                           c      d
            C     76      0                    C     76      0
     MAE                                MAE
            D     24      0                    D    924      0
   EUC = 3, EUD = 5                   EUC = 3, EUD = 5
   Score: MAE 348, CCC 228            Score: MAE 4848, CCC 228


Once again, with the exception of the first hundred moves, 
MAE's performance against CCC is identical to that of MAD and of MEU.
Finally, consider the performance of MAC. As expected, MAC also 
exploits CCC, but does so to a slightly lesser extent than MAE: 


Game 10.6 - MAC versus CCC, Event Matrices (100 & 1000 Moves)


   100 moves:        CCC              1000 moves:       CCC
                   c      d                           c      d
            C     87      0                    C     87      0
     MAC                                MAC
            D     13      0                    D    913      0
   EUC = 3, EUD = 5                   EUC = 3, EUD = 5
   Score: MAC 326, CCC 261            Score: MAC 4826, CCC 261


All maximization family members achieve scores against CCC that
are well in excess of their respective average scores in the main 
tournament. (Their average scores, from table 8.1, are: MAD, 2086; 
MEU, 2362; MAE, 2503; and MAC, 2645 points per game.) CCC, on the 
other hand, achieves scores against this family that are well below 
its tournament average (which is 1824 points per game).

Completely different situations obtain, however, when the 
maximization family competes against TFT. Because it is highly 
provocable, TFT is not exploitable. First, consider the performance 
of MAD after 100 and 1000 moves: 


Game 10.7 - MAD versus TFT, Event Matrices (100 & 1000 Moves)


   100 moves:        TFT              1000 moves:       TFT
                   c      d                           c      d
            C      2     11                    C      2     11
     MAD                                MAD
            D     12     75                    D     12    975
   EUC = .462, EUD = 1.55             EUC = .462, EUD = 1.05
   Score: MAD 141, TFT 136            Score: MAD 1041, TFT 1036


During the first hundred moves, MAD randomly defected on 87 
occasions and co-operated on 13 occasions. Since TFT responded in 
kind after its initial co-operation, it defected on 86 occasions and 
co-operated on 14 occasions. Thus, after one hundred moves, MAD leads 
TFT by only five points, which MAD accrued on the first move's (D,c)
outcome. And, after one hundred moves, EUD well exceeds EUC. Hence
MAD defects on move 101, and TFT follows suit.

It transpires that both strategies become locked into mutual 
defection for the duration of the game. The event matrix for 1000
moves differs from that for 100 moves only with respect to the number
of (D,d) outcomes, which has increased by 900 in the 900 subsequent
moves. The game score stands at MAD 1041, TFT 1036; MAD's narrow
margin of victory having been gained on the very first move of the
game.

That no outcome other than (D,d) occurred in the last 900 moves 
shows that EUD remained greater than EUC from move 101 onward.
However, an interesting and important phenomenon becomes manifest in
this game. Note that the value of EUD actually diminishes between 100
and 1000 moves (from 1.55 to 1.05). In fact, EUD decreases monotonically
from move 101 onward, and it is the very occurrence of a (D,d) 
outcome that forces the decrease. In other words, the occurrence of a 
mutual defection has the effect of lowering the expected utility of 
further defection. 

In the case of MAD versus TFT, however, the manifestation of 
this phenomenon does not result in eventual mutual co-operation, 
since the value of the expected utility of co-operation shows no 
increase at all during the course of the game. One has already 
observed that the maximum possible values of EUC and EUD (for the 
current payoff structure) are 3 and 5, respectively. One is now
interested in the minimum possible value of EUD. If this minimum is
smaller than 3, then a sufficient decrease in EUD combined with a
sufficient increase in EUC can result in the alteration of a
maximization family member's play, from defection to co-operation.

Recall the equation for the expected utility of defection: 


\[
\mathrm{EUD} = \frac{5Y + Z}{Y + Z}
\]


This is a two-variable function, which takes on its maximum of 5 when 
Z equals zero (as in the case of MAD versus CCC), and takes on its
minimum of unity when Y equals zero. When both Y and Z are non-zero 
(as in the case of MAD versus TFT), the function approaches these 
extrema only in the limit, with one variable held constant and the 
other increasing without bound. For the minimum value: 


\[
\mathrm{EUD}_{\min} = \lim_{Z \to \infty} \frac{5Y + Z}{Y + Z} = 1
\]


Thus, if the number of mutual defections increases without 
bound, the expected utility of defection is driven toward its minimum
value of unity. This value is well below the maximum value of EUC, so
it is indeed possible for a maximization family member to alter its 
play during the course of a game, from defection to co-operation. 
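
The behaviour of EUD under repeated mutual defection, and the condition for a switch to co-operation, can be made explicit with a short worked derivation. The rearrangement below is a sketch that follows from the two expected-utility formulas; it is not reproduced from the original text.

```latex
\begin{align*}
\mathrm{EUD} &= \frac{5Y + Z}{Y + Z} = 1 + \frac{4Y}{Y + Z}, \\
\lim_{Z \to \infty} \mathrm{EUD} &= 1 \qquad (Y \text{ fixed}), \\
\mathrm{EUC} > \mathrm{EUD} &\iff \frac{3W}{W + X} > 1 + \frac{4Y}{Y + Z}.
\end{align*}
```

For fixed Y greater than zero, the term 4Y/(Y+Z) shrinks with every additional (D,d) outcome, so EUD falls strictly toward 1 without reaching it. A switch to co-operation therefore requires EUC to exceed 1, that is, 2W > X. With W = 2 and X = 11, as in game 10.7, this condition fails, so no number of further mutual defections can bring EUD below EUC.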

In the case of MAD versus TFT, one can now appreciate that 1000 
moves sufficed to drive the value of EUD quite close to its lower
limit. But since the value of EUC did not increase at all, MAD 
continued to defect. 

Now consider the performance of MEU against TFT: 


Game 10.8 - MEU versus TFT, Event Matrices (100 & 200 Moves)


   100 moves:        TFT              200 moves:        TFT
                   c      d                           c      d
            C     21     27                    C     21     27
     MEU                                MEU
            D     27     25                    D     28    124
   EUC = 1.31, EUD = 3.08             EUC = 1.31, EUD = 1.73
   Score: MEU 223, TFT 223            Score: MEU 327, TFT 322


During its first hundred moves, MEU has randomly co-operated on 
48 occasions, and defected on 52 occasions. The score is tied after 
100 moves. Since EUD is greater than EUC, MEU defects on move 101. 
MEU must have co-operated on move 100, since TFT co-operates on move 
101. Hence, at move 101, the (D,c) entry in the event matrix is
incremented. Mutual defection ensues for the next 99 moves, during
which EUD decreases from 3.08 to 1.73. After 200 moves, MEU leads TFT 
by the five points it accrued on move 101. 

Between moves 201 and 300, another 100 mutual defections occur. 
EUC remains constant at 1.31, while EUD drops to 1.44. MEU retains 
its five point advantage, and leads TFT by 427 to 422 after 300 
moves. During the next two hundred moves, the following events take 
place. 

Between moves 301 and 400, another 100 mutual defections have 
occurred. But the value of EUD has been driven very close to that of
EUC. Following seven more mutual defections, at move 407, the value
of EUD falls below that of EUC (1.311 to 1.312, respectively). In 
consequence, at move 408, MEU co-operates. But TFT defects at move 
408, in response to MEU's previous defection at move 407. Thus the 
event matrix outcome of (C,d) is incremented (from 27 to 28) at move 
408. But this increment results in a decrease in the value of the
expected utility of co-operation, from 1.312 to 1.286. 


Game 10.9 - MEU versus TFT, Event Matrices (400 & 500 Moves)


   400 moves:        TFT              500 moves:        TFT
                   c      d                           c      d
            C     21     27                    C     21     29
     MEU                                MEU
            D     28    324                    D     30    420
   EUC = 1.312, EUD = 1.318           EUC = 1.26, EUD = 1.267
   Score: MEU 527, TFT 522            Score: MEU 633, TFT 628


Naturally, from MEU's point of view, a co-operative move on its 
part in tandem with a defection by its opponent is detrimental to 
MEU's expected utility of co-operation. (In fact, TFT has gained five
points on this move, and has tied the score.) Hence, at move 409, EUD
is once again greater than EUC, by 1.311 to 1.286, and MEU defects
anew. But TFT co-operates on move 409, in response to MEU's previous
co-operation. Thus the event matrix outcome of (D,c) is incremented 
(from 28 to 29) on move 409, and MEU regains a five-point lead. 

But this increment, not surprisingly, results in an increase in 
MEU's expected utility of defection, from 1.311 to 1.322. Another 
sequence of mutual defections follows, until the value of EUD is once 
again driven below that of EUC. Then the above process repeats 
itself. At this juncture, the strategies have reached the half-way 
mark of their game. 

Although EUC has superseded EUD on two separate occasions, no 
instances of mutual co-operation have occurred. Clearly, EUC must 
remain greater than EUD on two consecutive occasions for mutual co- 
operation to occur. But the existing distribution of outcomes for 
these strategies, which eventually gives rise to the sequence (D,d), 
(C,d), (D,c), (D,d), in fact precludes the possibility of (C,c) 
occurring, because the distribution is reinforced by the very 
sequence it produces. Consider the final one hundred moves of the 
game: 


Game 10.10 — MEU versus TFT, Event Matrices (900 & 1000 Moves) 


900 moves:       TFT                1000 moves:      TFT
               c       d                           c       d
        C     21      34                    C     21      34
MEU                                 MEU
        D     35     810                    D     35     910

EUC = 1.145, EUD = 1.166            EUC = 1.145, EUD = 1.148
Score: MEU 1048, TFT 1043           Score: MEU 1148, TFT 1143


In sum, a string of mutual defections periodically drives the 
value of EUD below that of EUC. Then, an outcome of (C,d) ensues, 
which in turn depresses the value of EUC. An outcome of (D,c) fol- 
lows, which temporarily inflates the value of EUD. Another string of 
mutual defections ensues, and the pattern repeats until the game 
ends. Both expected utilities are driven toward their minimum limit- 
ing value of unity. MEU defeats TFT by the five points it accrues on 
the ultimate occurrence of (D,c). 
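
The cycle summarized above can be replayed mechanically. The following is a minimal sketch, not the author's program: it assumes the payoffs 3, 0, 5 and 1, assumes that MEU co-operates whenever EUC is at least EUD (the handling of an exact tie is a guess), and starts from the 400-move matrix of Game 10.9. The names `play_meu_vs_tft` and `scores` are hypothetical.

```python
from fractions import Fraction


def play_meu_vs_tft(cc, cd, dc, dd, last_meu_move, moves):
    """Continue MEU's deterministic phase against TFT for `moves` moves.

    Counts are MEU's event matrix: (C,c), (C,d), (D,c), (D,d).
    TFT repeats MEU's previous move. MEU co-operates when EUC >= EUD;
    treating a tie as co-operation is an assumption of this sketch.
    """
    for _ in range(moves):
        euc = Fraction(3 * cc, cc + cd)
        eud = Fraction(5 * dc + dd, dc + dd)
        meu = "C" if euc >= eud else "D"
        tft = last_meu_move
        if meu == "C" and tft == "C":
            cc += 1
        elif meu == "C":
            cd += 1
        elif tft == "C":
            dc += 1
        else:
            dd += 1
        last_meu_move = meu
    return cc, cd, dc, dd


def scores(cc, cd, dc, dd):
    """Return (MEU score, TFT score) implied by an event matrix."""
    return 3 * cc + 5 * dc + dd, 3 * cc + 5 * cd + dd


# Start from the 400-move matrix of Game 10.9; MEU's move 400 was a defection.
final = play_meu_vs_tft(21, 27, 28, 324, "D", 600)
print(final, scores(*final))
```

Under these assumptions the replay never produces a (C,c) outcome: every co-operative move by MEU is met by a TFT defection, and the resulting drop in EUC sends MEU back to defection on the next move.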

It transpires that MAE's performance against TFT unfolds in a 
similar fashion, with one significant difference. During its first 
hundred moves, MAE co-operates randomly with a probability of 5/7, as 
compared with MEU's 1/2. This results in a comparatively more co- 
operative distribution of outcomes for those one hundred moves, which 
in turn increases the frequency of the movement away from mutual 
defection. Consider the first two hundred moves of MAE's performance 
against TFT. 


Game 10.11 - MAE versus TFT, Event Matrices (100 & 200 Moves) 


100 moves:       TFT                200 moves:       TFT
               c       d                           c       d
        C     44      22                    C     44      27
MAE                                 MAE
        D     22      12                    D     28     101

EUC = 2.00, EUD = 3.588             EUC = 1.859, EUD = 1.868
Score: MAE 254, TFT 254             Score: MAE 373, TFT 368


Observe that the sequence of outcomes (D,d), (C,d), (D,c), 
(D,d), which did not occur until after the four hundredth move in the 
game between MEU and TFT, occurs five times before the two hundredth 
move in the game between MAE and TFT. This pattern repeats, with 
comparatively greater frequency, for the duration of the game. 
Consider the final one hundred moves: 


Game 10.12 — MAE versus TFT, Event Matrices (900 & 1000 Moves) 


900 moves:       TFT                1000 moves:      TFT
               c       d                           c       d
        C     44      58                    C     44      60
MAE                                 MAE
        D     59     739                    D     61     835

EUC = 1.294, EUD = 1.295            EUC = 1.269, EUD = 1.272
Score: MAE 1166, TFT 1161           Score: MAE 1272, TFT 1267


From move 101 onward, a total of 77 departures from mutual 
defection occurred between MAE and TFT, as compared with none between 
MAD and TFT, and only 15 between MEU and TFT. Nonetheless, MAE is 
unable to make the two consecutive co-operative moves necessary to 
engender an occurrence of mutual co-operation. In consequence, the 
MAE-TFT pair attains slightly higher scores than the MAD-TFT and 
MEU-TFT pairs, but all these scores remain well below the average main 
tournament scores of the strategies involved. 

When MAC competes against TFT, a rather different picture 
emerges. Consider the first two hundred moves of their encounter: 


Game 10.13 - MAC versus TFT, Event Matrices (100 & 200 Moves) 


100 moves:       TFT                200 moves:       TFT
               c       d                           c       d
        C     84       7                    C    173       8
MAC                                 MAC
        D      7       2                    D      8      11

EUC = 2.769, EUD = 4.111            EUC = 2.867, EUD = 2.684
Score: MAC 289, TFT 289             Score: MAC 570, TFT 570


In Game 10.13, novel circumstances arise. During its first 
hundred moves, MAC randomly co-operates on 91 occasions, and defects 
on 9 occasions. TFT, of course, replies in kind. But owing to the 
preponderance of MAC's co-operations over its defections, MAC is 
bound to co-operate on a fair number of consecutive occasions. 
Combined with TFT's play, the result is a relatively large number of 
(C,c) outcomes during the first hundred moves. Indeed, although MAC's 
initial co-operative weighting is only 18% higher than MAE's, the 
MAC-TFT pair realizes almost twice as many mutually co-operative 
outcomes as the MAE-TFT pair (84 to 44, during the first hundred 
moves). 

Although MAC's expected utility of defection is greater than 
its expected utility of co-operation at move 101, the distribution of 
outcomes after 100 moves is co-operative enough for the pair to lock 
into mutual co-operation within a dozen further moves. They do so in 
the following way. 

MAC defects at move 101. Since MAC co-operated at move 100, TFT 
co-operates at move 101. The (D,c) outcome gives MAC a five-point 
lead, and temporarily inflates the value of the expected utility of 
defection, from 4.11 to 4.2. A string of mutual defections ensues, 
which forces the value of the expected utility of defection steadily 
downward, while the value of the expected utility of co-operation 
remains unchanged (at 2.769). At move 111, EUD is finally less than 
EUC (2.684 to 2.769), so MAC co-operates. Owing to MAC's defection at 
move 110, TFT defects at move 111. The (C,d) outcome allows TFT to 
tie the score, and depresses the value of the expected utility of co- 
operation, from 2.769 to 2.739. 

In the games involving MAD, MEU and MAE against TFT, this 
cyclical depression forced the value of EUC below that of EUD, 
resulting in a (D,c) outcome on the subsequent move, followed by 
another string of mutual defections. But in this game, given the 
overwhelmingly co-operative distribution of outcomes after the 
initial 100 moves, the value of the expected utility of co-operation 
remains greater than that of defection, by 2.739 to 2.684. Thus MAC 
co-operates at move 112. Finally, two consecutive co-operative moves 
have been prescribed by the maximization calculus. Since MAC co- 
operated at move 111, TFT co-operates at move 112. The outcome (C,c) 
results. 

This mutually co-operative outcome drives the value of EUC 
upward, from 2.739 to 2.741, while the value of EUD remains unchanged 
at 2.684. Hence MAC co-operates at move 113. And since MAC co-opera- 
ted at move 112, TFT also co-operates at move 113. This (C,c) outcome 
further increases the value of EUC, and so forth. The strategic pair 
is locked into mutual co-operation, which continues for the duration 
of the game. 
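
The difference between MAC and its siblings comes down to one comparison: whether EUC still matches or exceeds EUD after absorbing a single (C,d) outcome. A minimal sketch of that check follows, under the assumed payoffs 3, 0, 5 and 1; the function name is hypothetical, and the two matrices are the ones implied by the text at the moment EUC first overtakes EUD.

```python
from fractions import Fraction


def cooperation_survives_sucker(cc, cd, dc, dd):
    """Check whether EUC still matches or exceeds EUD after one more (C,d).

    The counts describe the event matrix at the moment EUC first overtakes
    EUD. The next outcome against TFT is then (C,d), which lowers EUC and
    leaves EUD untouched. If EUC remains at or above EUD, a second
    consecutive co-operative move follows and (C,c) results.
    """
    euc_after = Fraction(3 * cc, cc + cd + 1)
    eud = Fraction(5 * dc + dd, dc + dd)
    return euc_after >= eud


# MEU versus TFT after move 407 (Game 10.9 plus seven mutual defections).
print(cooperation_survives_sucker(21, 27, 28, 331))   # False
# MAC versus TFT after move 110 (Game 10.13 plus one (D,c) and nine (D,d)).
print(cooperation_survives_sucker(84, 7, 8, 11))      # True
```

For MEU the sucker's payoff drops EUC to about 1.286, below EUD at about 1.312; for MAC it drops EUC only to about 2.739, still above EUD at 2.684.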

Note that the value of EUC increases steadily owing to the 
lengthy string of mutual co-operations. By the end of the game, the 
expected utility of co-operation is quite close to its maximum 
limiting value of three. Note also that both MAC and TFT attain 
scores in this game (a draw at 2970 points) which are comparable to 
scores attained by a pair of nice strategies (a draw at 3000 points). 
The MAC-TFT pair realizes scores well above either of their main 
tournament averages. 


Game 10.14 — MAC versus TFT, Event Matrix After 1000 Moves 


                 TFT
               c       d
        C    973       8
MAC
        D      8      11


EUC = 2.976, EUD = 2.684 


Score after 1000 moves: MAC 2970, TFT 2970 


The move-by-move interactions of the maximization family 
members with TFT are summarily displayed in the following graph, 
which plots the algebraic difference in expected utilities (EUC minus 
EUD) as a function of the number of moves made (in increments of 100 
moves), for each of the four members versus TFT: 


Graph 10.1 Maximization Family Versus TFT 
Difference Between Expected Utilities 
At 100-Move Intervals 

(Plot of EUC minus EUD against number of moves; the area above zero is 
the Region of Co-operation, the area below zero the Region of Defection.) 


Obviously, when the difference EUC minus EUD is greater than 
zero, a maximization family member co-operates; when less than zero, 
it defects. Thus, in graph 10.1, the abscissa marks a boundary 
between regions of co-operation and defection. After 100 initial 
moves, all four strategies find themselves in the region in which the 
maximization of expected utility prescribes defection. Whether a 
strategy proceeds to the boundary or not, and whether a strategy 
crosses the boundary or not, depends upon the initial distribution of 
outcomes in its event matrix, which in turn depends upon its initial 
probabilistic weighting. 

The MAD-TFT pair is mired in perpetual mutual defection. MAD's 
EUC remains fixed at .462 (for the distribution that obtains in game 
10.7), while MAD's EUD approaches its asymptotic minimum of unity. 
Thus the difference approaches -0.538 as the number of moves 
increases. MAD will never be able to venture within .538 utiles of the 
co-operative border. 

The MEU-TFT and MAE-TFT pairs both manage to approach the 
border, and even to straddle it on occasion. MAE, being initially 
more co-operative than MEU, approaches it more quickly and straddles 
it more frequently. However, while both strategies’ expected utili- 
ties converge toward their asymptotic values of unity, EUD converges 
more slowly, in the mean, than does EUC. Thus the differences between 
EUC and EUD, for both MAE and MEU, remain mostly negative. Although 
both strategies manage occasional co-operative moves, neither stra- 
tegy is able to extricate itself from the region of defection. 

The MAC-TFT pair quickly traverses the border, and remains 
thereafter in the co-operative region. MAC's EUC approaches its 
limiting maximum value of 3, while MAC's EUD remains constant (after 
move 110) at 2.684 (for the distribution that obtains in game 10.13). 
Thus the difference between EUC and EUD increases toward the asymp- 
tote +0.316. The longer the game continues, the more deeply MAC moves 
into the region of co-operation, subject to its limit of 0.316 utiles 
from the border. 
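
The two limiting differences quoted above follow from the form of the expected utilities. The expressions below are inferred from the event-matrix figures in this chapter (payoffs 3, 0, 5 and 1); the symbols n_{Cc}, n_{Cd}, n_{Dc} and n_{Dd} are assumed notation for the four cell counts, not the author's.

```latex
\[
\mathrm{EUC} = \frac{3\,n_{Cc}}{n_{Cc}+n_{Cd}}, \qquad
\mathrm{EUD} = \frac{5\,n_{Dc}+n_{Dd}}{n_{Dc}+n_{Dd}} .
\]
% MAD versus TFT: only n_{Dd} grows, so EUC stays fixed at 0.462.
\[
\lim_{n_{Dd}\to\infty} \mathrm{EUD} = 1
\quad\Longrightarrow\quad
\mathrm{EUC}-\mathrm{EUD} \;\to\; 0.462 - 1 = -0.538 .
\]
% MAC versus TFT: only n_{Cc} grows, so EUD stays fixed at 2.684.
\[
\lim_{n_{Cc}\to\infty} \mathrm{EUC} = 3
\quad\Longrightarrow\quad
\mathrm{EUC}-\mathrm{EUD} \;\to\; 3 - 2.684 = +0.316 .
\]
```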

Now, to place the results of the maximization family's perfor- 
mances against CCC and TFT into perspective, consider the total and 
average scores that each member obtained against them, compared with 
each member's average main tournament score: 


Table 10.1 — Comparison of Scores 


          vs.       vs.      Average         Tournament
          CCC       TFT      ½(CCC+TFT)      Average
MAC      4826      2970        3898            2645
MAE      4848      1272        3060            2503
MEU      4900      1148        3024            2362
MAD      4986      1041        3014            2086


It is evident from table 10.1 that the maximization family 
members realize substantial average gains when they compete against 
exploitable and non-exploitable strategies in a one-to-one ratio. 
Their relative gains against CCC far outweigh their relative losses 
against TFT. And note that the average scores, ½(CCC+TFT), increase 
with the co-operativeness of the maximization members’ weightings. 
MAD is the most exploitive member of its family; MAC, the least 
exploitive. Against the exploitable CCC, MAD scores 160 points more 
than MAC. But against the non-exploitable TFT, MAD scores 1929 fewer 
points than MAC. Thus MAC's initial co-operativeness does not greatly 
impair MAC's ability to exploit CCC, while it greatly enhances MAC's 
performance against TFT. 

But do the maximization family members satisfy Axelrod's 
hypothetical criterion of success? Are they able to exploit the 
exploitable without paying too high a cost with the others? If the 
maximization family's performances against CCC and TFT are fair 
indicators, then the answer to this question seems to depend upon the 
ratio of "exploitable" to "other" strategies in the environment. If 
the ratio is one-to-one, then the answer is in the affirmative for 
the whole family, and especially so for MAC. 

However, a glance at the last two columns of table 10.1 shows 
that the maximization family members all realize lower average scores 
against the twenty strategies of the main tournament than against CCC 
and TFT only. This indicates that the ratio of exploitable to non- 
exploitable strategies is less than one-to-one in the main tourna- 
ment. Given that MAC won the main tournament handily, and that MAE 
placed second, it does appear that both these strategies satisfy 
Axelrod's criterion; whereas both MEU and MAD pay far too high a 
price for their slightly greater exploitiveness. 

In light of this microscopic examination of the maximization 
family's mechanics, the reasons for MAC's success in the interactive 
tournament (and the graduated performances of MAC's siblings) become 
clearer. But MAC's success also entails a certain cost. Ironically, 
that cost is exacted not by "others" belonging to different strategic 
families, but by MAC's siblings and its own twin. Consider how the 
maximization family members fare against one another, compared with 
their average main tournament scores: 


Table 10.2 — The Maximization Family Versus Itself 


          MAC     MAE     MEU     MAD     Family    Tour.
                                          Avg.      Avg.
MAC      1807    1849    1741     971      1592     2645
MAE      2123    2594    2356     987      2015     2503
MEU      1887    2396    2384    1003      1918     2362
MAD      1332    1266    1181    1029      1202     2086


As table 10.2 reveals, the maximization family members' average 
scores against one another are considerably below their average main 
tournament scores. In particular, MAC's is 1053 points lower. In 
intra-family competition, MAC ranks third among four siblings. And in 
auto-competition, the MAC-MAC pair also ranks third behind MAE-MAE 
and MEU-MEU. 
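
The family averages, the intra-family ranking and the shortfall against the main tournament can be recomputed from the figures in Table 10.2. This is a small arithmetic sketch; the variable and function names are illustrative, and rounding half up to the nearest point is an assumption about how the table was compiled.

```python
# Scores from Table 10.2: each row is one strategy's score against
# MAC, MAE, MEU and MAD respectively.
family_scores = {
    "MAC": [1807, 1849, 1741, 971],
    "MAE": [2123, 2594, 2356, 987],
    "MEU": [1887, 2396, 2384, 1003],
    "MAD": [1332, 1266, 1181, 1029],
}
tournament_average = {"MAC": 2645, "MAE": 2503, "MEU": 2362, "MAD": 2086}


def family_average(row):
    """Mean of a row, rounded half up to the nearest point (assumed rule)."""
    return (2 * sum(row) + len(row)) // (2 * len(row))


averages = {name: family_average(row) for name, row in family_scores.items()}
ranking = sorted(averages, key=averages.get, reverse=True)
shortfall = {name: tournament_average[name] - averages[name] for name in averages}

print(averages)    # family average per strategy
print(ranking)     # best to worst within the family
print(shortfall)   # tournament average minus family average
```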

Thus Axelrod's dictum, that there is no "best" strategy in- 
dependent of environment, continues to ring true. MAC proved its 
robustness in hundreds of thousands of combinatoric sub-tournaments, 
and in thousands of generations of ecosystemic competition. But in an 
environment consisting solely of its family members, MAC loses every 
competition against its siblings and fares poorly against itself. 

This result suggests that if Axelrod's hypothetical criterion 
of success is to have broader applicability, then it should be 
amended. If a strategy could be devised which exploits the exploit- 
able without paying too high a cost with the others, and which 
emerges victorious in a sub-tournament against the maximization 
siblings, then that strategy would win the interactive tournament, 
and would probably win many other tournaments as well. Such a feat, 
however, may be more easily articulated than accomplished. 

MAC still remains the most robust strategy in the interactive 
environment, but it is somewhat surprising to find that MAC's success 
is most jeopardized by its siblings and its twin. The task of the 
next chapter is to discover why the members of the maximization 
family encounter their greatest difficulties in competition against 
one another. 


Chapter Eleven 
The Maximization Family Versus Itself 


In order to understand what takes place when a maximization 
family member encounters a sibling, or its twin, one must recognize a 
unique property of this family; namely, its members’ sequential and 
mutually exclusive use of probabilistic, then deterministic algo- 
rithms. To clarify the meaning of this property, one can first 
qualify the algorithms used by other strategic families. 

The probabilistic family members function by random methods. 
TQC, RAN and MD do so exclusively. As has been mentioned, an alter- 
native interpretation can be made for CCC and DDD. From the viewpoint 
of program logic, they co-operate with probabilities unity and zero, 
respectively. From the viewpoint of ends rather than means, they are 
also pure strategies, whose play is therefore not deterministic; 
rather, pre-determined. 

The TFT family members function deterministically, with the 
exception of BBE. BBE employs a deterministic rule with a probabilis- 
tic condition attached, and thus mixes two kinds of algorithm. 

In the optimization family, NYD is strictly deterministic, 
while CHA, GRO and ETH make simultaneous use of both determinism and 
probabilism. 

The strategic hybrids, FRI and TES, function according to 
strictly deterministic rules. 

Maximization family members, however, can be regarded as 
algorithmic hybrids. They employ a purely probabilistic rule for 
their first hundred moves, then shift to a strictly deterministic 
calculus for the duration of the game. But unlike BBE, CHA, GRO and 
ETH, the maximization strategies never mix these two kinds of al- 
gorithm; their use of the two is always sequential and mutually 
exclusive. 

This property naturally gives rise to two discernibly different 
phases in a maximization strategy's play: first, its construction of 
the initial event matrix for 100 moves; second, its calculation of 
expected utilities, and updating of the matrix, for the subsequent 
900 moves. These phases were observed in the previous chapter, during 
encounters between maximization strategies, CCC and TFT. But when 
maximization family members encounter one another, the phasing of 
their play takes on a dual aspect, wherein certain symmetries, as 
well as anti-symmetries, become apparent. New and interesting proper- 
ties of the event matrix are thereby revealed, and a deeper under- 
standing of the results of these intra-familial encounters is 
achieved. 

In sum, one can identify five different kinds of algorithmic 
function in the interactive environment: pre-determined, probabilis- 
tic, deterministic, mixed probabilistic and deterministic, and 
sequential probabilistic and deterministic. The reason for this 
identification is quite important. 

If two pre-determined and/or deterministic strategies are 
paired in a sequence of games, the scores of a given pair will 
obviously not vary from one game to another. For example, if DDD 
plays TFT, their score is always the same: DDD 1004, TFT 999. 
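
The fixed score for DDD against TFT is easy to confirm by direct play. The sketch below assumes the payoffs 3, 0, 5 and 1, a game of 1000 moves, and a TFT that opens with co-operation; the function names are illustrative.

```python
# Payoffs (first player, second player) for each pair of moves.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def always_defect(own_history, opp_history):
    """DDD: defect on every move."""
    return "D"


def tit_for_tat(own_history, opp_history):
    """TFT: co-operate first (assumed), then copy the opponent's last move."""
    return opp_history[-1] if opp_history else "C"


def play(strategy_a, strategy_b, moves=1000):
    """Play an iterated game and return the two final scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(moves):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b


print(play(always_defect, tit_for_tat))   # (1004, 999)
```

Because neither rule consults a random number, repeating the game always reproduces the same pair of scores: one (D,c) outcome on the first move, followed by 999 mutual defections.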

If a probabilistic (or mixed probabilistic and deterministic) 
strategy is paired with any strategy other than a sequential strategy 
in a sequence of games, the scores of the given pair will vary 
according to a normal distribution, in which the mean score tends 
toward the most probable score, as the number of games increases. 

For a simple example, consider two probabilistic strategies. If 
RAN plays TQC, then the a priori probabilities of the outcomes are as 
follows: 


Game 11.1 - Probability Matrix for RAN versus TQC 


                           TQC
                  p(c)=3/4    p(d)=1/4
      p(C)=1/2      3/8         1/8
RAN
      p(D)=1/2      3/8         1/8


In a game of 1000 moves, the most probable distribution of 
outcomes is therefore: (C,c) and (D,c), 375 occasions each; (C,d) and 
(D,d), 125 occasions each. Hence RAN's most probable score is 3x375 + 
5x375 + 0x125 + 1x125 = 3125. TQC's most probable score is 3x375 + 
0x375 + 5x125 + 1x125 = 1875. The actual score obtained in their main 
tournament game is RAN 3139, TQC 1914. But a sequence of 100 games 
produces the mean score RAN 3122, TQC 1877, with both sets of scores 
distributed fairly normally about their respective means: 

Graph 11.1 RAN versus TQC 
Histogram of RAN's Scores for 100 Games 
RAN's Average Score: 3122 
(Axes: Frequency versus Range of Scores) 


Graph 11.2 TQC versus RAN 
Histogram of TQC's Scores for 100 Games 
TQC's Average Score: 1877 
(Axes: Frequency versus Range of Scores) 


Again, given a normal distribution, the difference between the 
most probable score and the mean score tends to decrease with the 
number of games played. 
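
The most probable scores for two purely probabilistic strategies, and the way an empirical mean settles near them, can be sketched as follows. The payoffs 3, 0, 5 and 1 are taken from the arithmetic above; the function names, the seed and the use of Python's `random` module are assumptions of this illustration, so the simulated means will not match the tournament's own figures exactly.

```python
import random
from fractions import Fraction

# Payoffs (row player, column player) for each outcome.
PAYOFF = {("C", "c"): (3, 3), ("C", "d"): (0, 5),
          ("D", "c"): (5, 0), ("D", "d"): (1, 1)}


def most_probable_scores(p_row, p_col, moves=1000):
    """Expected scores when both players co-operate independently at random."""
    row = col = Fraction(0)
    for r, pr in (("C", p_row), ("D", 1 - p_row)):
        for c, pc in (("c", p_col), ("d", 1 - p_col)):
            weight = pr * pc * moves          # expected count of this outcome
            row += weight * PAYOFF[(r, c)][0]
            col += weight * PAYOFF[(r, c)][1]
    return row, col


def simulate_mean(p_row, p_col, games=100, moves=1000, seed=0):
    """Mean scores per game over a sequence of simulated games."""
    rng = random.Random(seed)
    total_row = total_col = 0
    for _ in range(games * moves):
        r = "C" if rng.random() < p_row else "D"
        c = "c" if rng.random() < p_col else "d"
        pay_row, pay_col = PAYOFF[(r, c)]
        total_row += pay_row
        total_col += pay_col
    return total_row / games, total_col / games


# RAN co-operates with probability 1/2, TQC with probability 3/4.
row, col = most_probable_scores(Fraction(1, 2), Fraction(3, 4))
print(row, col)
print(simulate_mean(0.5, 0.75))
```

Raising `games` narrows the gap between the simulated mean and the most probable score, which is the behaviour described above for normally distributed scores.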

For a more complex example, consider a pair of strategies that 
use mixed deterministic and probabilistic algorithms, for instance 
BBE and ETH. In such a case, one cannot readily construct a matrix of 
a priori probable outcomes, but one can make an empirical test to see 
whether a normal distribution of scores obtains. The following 
histograms show the distributions of scores for 100 games between BBE 
and ETH: 


Graph 11.3 BBE versus ETH 
Histogram of BBE's Scores for 100 Games 
BBE's Average Score: 3185 
(Axis: Frequency) 


Graph 11.4 ETH versus BBE 
Histogram of ETH's Scores for 100 Games 
ETH's Average Score: 2688 
(Axis: Frequency) 


Once again, these scores appear to be distributed normally. 

Now, consider what takes place between a maximization strategy 
and a strategy that uses a mixed probabilistic and deterministic 
algorithm, for instance MAE and CHA. Again, an empirical test is 
conducted, and the following histogram shows CHA's distribution of 
scores for 100 games against MAE: 


Graph 11.5 CHA versus MAE 
Histogram of CHA's Scores for 100 Games 
CHA's Average Score: 2447 
(Axes: Frequency versus Range of Scores, 2100 to 2900) 


One finds CHA's scores to be normally distributed. 

MAE's scores against CHA are comparatively highly concentrated. 
Ninety-nine scores lie in the 2850-3000 point range; one, in the 
3000-3100 point range. 

The maximization family members' scores against one another, 
however, are neither concentrated nor distributed normally, with one 
noteworthy exception. In consequence, their average scores do not, as 
a rule, approach their most probable scores as the number of games 
increases. 

Let the exception to the rule, which occurs in games involving 
MAD, be considered first. The extreme case of this exception obtains 
when MAD plays itself. After the first one hundred moves, the most 
probable event matrix is as follows: 


Game 11.2 — Most Probable Event Matrix for MAD versus MAD (100 Moves) 


                            MAD
                  p(c)=1/10   p(d)=9/10
      p(C)=1/10       1           9
MAD
      p(D)=9/10       9          81


EUC = 0.3, EUD = 1.4 


Score tied at 129 


The deterministic play that ensues from this matrix, from moves 
101 to 1000, consists of 900 consecutive mutual defections. The game 
ends with the score tied at 1029. Since this score is a deterministic 
end-product of the most probable event matrix, it is the most prob- 
able score. Empirically, after five hundred games, MAD's average 
score is found to be 1029. The scores themselves appear to be dis- 
tributed normally, as the following histogram reveals: 


Graph 11.6 MAD versus MAD 
Histogram of Scores for 100 Games 
MAD's Average Score: 1029 
Frequency 


Next, consider the most probable event matrix for MEU versus 
MEU, after 100 moves: 


Game 11.3 - Most Probable Event Matrix for MEU versus MEU (100 Moves) 


                            MEU
                  p(c)=1/2    p(d)=1/2
      p(C)=1/2       25          25
MEU
      p(D)=1/2       25          25


EUC = 1.5, EUD = 3 
Score tied at 225 


The deterministic phase of game 11.3 proceeds as follows. One- 
hundred-and-fifty consecutive mutual defections obtain between moves 
101 and 250, with a concomitant steady decrease in the value of EUD. 
By move 251, the value of EUD is forced below that of EUC, and 750 
consecutive mutual co-operations ensue. After 1000 moves, the score 
is tied at 2625. Again, it is the most probable score. 

Empirically, however, after 500 games of MEU versus MEU, the 
average score is found to be 2384. This is substantially less than 
the most probable value. The cause of the discrepancy is revealed in 
a histogram showing the distribution of scores for 500 games of MEU 
versus MEU. 

Graph 11.7 displays a bi-modal distribution, with a minor 
prominence in the 1100-1200 point range, and a skewed distribution 
across the middle and upper ranges. The peak of the skewed distribu- 
tion indeed coincides with the most probable (a priori) score, in the 
2600-2700 point range. But the minor feature at the low end of the 
range, along with the overall skewness, diminishes the average score. 


Graph 11.7 MEU versus MEU 
Histogram of Scores for 500 Games 
MEU's Average Score: 2384 


Next, consider the most probable event matrix for MAE versus 
MAE, after 100 moves: 


Game 11.4 — Most Probable Event Matrix for MAE versus MAE (100 Moves) 


                            MAE
                  p(c)=5/7    p(d)=2/7
      p(C)=5/7       52          20
MAE
      p(D)=2/7       20           8


EUC = 2.17, EUD = 3.86 


Score tied at 264 


Game 11.4 proceeds in this way. Forty-one mutual defections 
take place between moves 101 and 141, followed by 859 mutual co- 
operations. After 1000 moves, the score is tied at 2882 points. 
Again, this represents the most probable score. 

Empirically, however, after 500 games of MAE versus MAE, the 
average score is found to be 2594 points. Again, a histogram reveals 
the cause of the discrepancy between the most probable and the 
average scores: 


Graph 11.8 MAE versus MAE 
Histogram of Scores for 600 Games 
MAE's Average Score: 2604 
(Horizontal axis: Range of Scores x 1/100) 


Graph 11.8 displays a skewed distribution. While the most 
frequent scores by far occur in the 2800-2900 point range, which is 
the range of the most probable score, the skew of the distribution 
toward the lower ranges diminishes the average score by some 250 
points. 

A closer look at the histogram affords a more detailed inter- 
pretation. The minor prominence from the previous histogram (located 
in the 1000-1300 point range of graph 11.7) may have experienced a 
radical decrease, and migrated to the 1300-1600 point range. Indeed, 
other features of increasing prominence appear in the 1900-2000, 
2300-2400, and 2600-2700 point ranges. 

It is not yet possible to judge whether these features merely 
denote statistical irregularities in the profile of a badly-skewed 
distribution, or whether they indicate that the distribution itself 
is beginning to become fragmented. 

Finally, consider the most probable event matrix for MAC versus 
MAC, after 100 moves: 


Game 11.5 — Most Probable Event Matrix for MAC versus MAC (100 Moves) 


                            MAC
                  p(c)=9/10   p(d)=1/10
      p(C)=9/10      81           9
MAC
      p(D)=1/10       9           1


EUC = 2.7, EUD = 4.6 
Score tied at 289 


In game 11.5, mutual co-operation commences on move 113, after 
only twelve consecutive mutual defections. The deterministic string 
of 888 mutual co-operations between moves 113 and 1000, in addition 
to the 81 probabilistic mutual co-operations during the first 100 
moves, yields a total of 969 instances of mutual co-operation in a 
game of 1000 moves. The resultant score, which again represents the 
most probable score, is tied at 2965 points. 
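
The deterministic phases described for Games 11.2 through 11.5 can be replayed from their most probable event matrices. The sketch below is an illustration under stated assumptions, not the author's program: payoffs 3, 0, 5 and 1, a deterministic phase of 900 moves, and co-operation whenever EUC is at least EUD (the tie rule is a guess, chosen because it reproduces the 150 mutual defections reported for MEU versus MEU). The name `twin_game` is hypothetical.

```python
from fractions import Fraction


def twin_game(cc, cd, dc, dd, moves=900):
    """Deterministic phase for twins sharing a symmetric event matrix.

    Because cd == dc, both players compute the same EUC and EUD and so
    always make the same move. A tie is resolved as co-operation (assumed).
    Returns (mutual defections played, final score of either player).
    """
    if cd != dc:
        raise ValueError("matrix must be symmetric across its major diagonal")
    defections = 0
    for _ in range(moves):
        euc = Fraction(3 * cc, cc + cd)
        eud = Fraction(5 * dc + dd, dc + dd)
        if euc >= eud:
            cc += 1          # mutual co-operation
        else:
            dd += 1          # mutual defection
            defections += 1
    return defections, 3 * cc + 5 * dc + dd


# Most probable 100-move matrices from Games 11.2 through 11.5.
matrices = {
    "MAD": (1, 9, 9, 81),
    "MEU": (25, 25, 25, 25),
    "MAE": (52, 20, 20, 8),
    "MAC": (81, 9, 9, 1),
}
for name, counts in matrices.items():
    print(name, twin_game(*counts))
```

Once the first mutual co-operation occurs, EUC can only rise while EUD stays fixed, so the pair never returns to defection; this is why each game splits into one run of (D,d) followed by one run of (C,c).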

MAC versus MAC, however, yields the largest empirical deviation 
in its family. After 500 games of MAC versus MAC, the average score 
is found to be 1807 points, a remarkable difference of 1158 points 
between most probable and average scores. Again, a histogram reveals 
the cause of this large discrepancy. 


Graph 11.9 MAC versus MAC 
Histogram of Scores for 600 Games 
MAC's Average Score: 1807 
(Horizontal axis: Range of Scores x 1/100) 


Graph 11.9 shows a multi-modal distribution of scores, with 
prominent features in the 1300-1400, 1600-1700, 2300-2400, and 2900- 
3000 point ranges. In addition, troughs appear between 1900-2200 and 
2600-2700 points, from which ranges scores seem to be excluded. The 
histogram clearly illustrates why the average score for MAC versus 
MAC is well below the most probable a priori score. And in retros- 
pect, it seems that the previous histogram (graph 11.8) shows signs 
of the impending fragmentation. But this illustration merely begs the 
question: why does the distribution become so fragmented? 

Indeed, this is one of a number of questions raised by an 
examination of the distribution of scores among members of the 
maximization family. In the four cases considered, in increasing 
order of initial co-operative weighting, one finds: first, a con- 
centration of scores at the low end of the scale; second, a skewed 
bi-modal distribution with a minor prominence at the low end; third, 
a skewed distribution which may be in the preliminary stages of 
fragmentation; and fourth, a multi-modal distribution which has 
fragmented into several distinct features. One may ask why these 
differences occur, given that each distribution represents a range of 
deterministic results stemming from a domain of probabilistic initial 
conditions. What causes such pronounced changes in the profiles of 
the distributions? 

Answers can be found in deeper analysis of the event matrix. 
There are 176,851 different combinations of 100 trials of the four 
possible outcomes; in other words, for the first 100 moves in the 
iterated prisoner's dilemma, there are 176,851 possible event matri- 
ces. To facilitate analysis, one seeks to formulate a few general 
principles that extend to the many different cases. 
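
The figure of 176,851 is the number of ways to distribute 100 moves among four outcome cells, which is a standard stars-and-bars count. A one-line check, with an illustrative function name:

```python
from math import comb


def count_event_matrices(moves, outcomes=4):
    """Number of ways to split `moves` trials among `outcomes` cells.

    Stars and bars: choose positions for (outcomes - 1) dividers among
    (moves + outcomes - 1) slots.
    """
    return comb(moves + outcomes - 1, outcomes - 1)


print(count_event_matrices(100))   # 176851
```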

First, consider those matrices which are symmetric across their 
major diagonals; that is, event matrices in which the numbers of 
(C,d) and (D,c) outcomes are identical after 100 moves. Such matrices 
obtain from a priori probabilistic encounters between maximization 
family twins. As a most general example, suppose that any maximiza- 
tion strategy MAX, with an initial co-operative weighting of p, meets 
its twin. Then, during their first hundred moves, both strategies co- 
operate randomly with probability p, and defect with probability 
(1-p). The most probable event matrix is: 


Game 11.6 — Most Probable Event Matrix for MAX versus MAX (100 Moves) 


                            MAX
                  p(c)=p        p(d)=1-p
      p(C)=p      100p²         100p(1-p)
MAX
      p(D)=1-p    100p(1-p)     100(1-p)²


EUC = 3p, EUD = 4p+1 


Score tied at 100(1+3p-p²) 
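
The three expressions under Game 11.6 follow directly from the four cell counts. The derivation below assumes the payoffs 3, 0, 5 and 1 and treats each expected utility as the mean payoff received for that move; the symbols n_{Cc}, n_{Cd}, n_{Dc} and n_{Dd} are assumed notation for the cells.

```latex
\[
n_{Cc} = 100p^{2}, \qquad n_{Cd} = n_{Dc} = 100p(1-p), \qquad n_{Dd} = 100(1-p)^{2}
\]
\[
\mathrm{EUC} = \frac{3\,n_{Cc} + 0\,n_{Cd}}{n_{Cc}+n_{Cd}}
             = \frac{300p^{2}}{100p^{2}+100p(1-p)}
             = \frac{300p^{2}}{100p} = 3p
\]
\[
\mathrm{EUD} = \frac{5\,n_{Dc} + 1\,n_{Dd}}{n_{Dc}+n_{Dd}}
             = \frac{500p(1-p)+100(1-p)^{2}}{100(1-p)}
             = 5p + (1-p) = 4p+1
\]
\[
\text{Score} = 3\,n_{Cc} + 0\,n_{Cd} + 5\,n_{Dc} + 1\,n_{Dd}
             = 100\left(3p^{2} + 5p - 5p^{2} + 1 - 2p + p^{2}\right)
             = 100\left(1 + 3p - p^{2}\right)
\]
```

As a check, p = 1/2 gives EUC = 1.5, EUD = 3 and a score of 225, in agreement with Game 11.3.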


Particular members of this class of event matrix have already 
been encountered in games 11.2 through 11.5, inclusively. The sig- 
nificance of symmetry across the major diagonal is as follows. When 
the number of (C,d) outcomes equals the number of (D,c) outcomes, 
then both competitors have the same expected utility of co-operation, 
and the same expected utility of defection. In consequence, from move 
101 onward, their joint play is identical, with outcomes of either 
(D,d) or (C,c). 

Precisely this process unfolds in the a priori evaluations of 
most probable scores for MAD versus MAD, MEU versus MEU, MAE versus 
MAE, and MAC versus MAC. One naturally observes increasing scores 
(1029, 2625, 2882, and 2965 points respectively) as the co-operative 
weighting increases. MAD's most probable score against its twin is 
far lower than MAD's siblings' most probable scores against their 
respective twins because, unlike MAD, the other siblings sooner or 
later attain mutual co-operation with their respective twins. 

Empirically, it is found that the threshold weighting for the 
eventual attainment of mutual co-operation is p = 37/100 (in a game 
of 1000 moves with the payoffs of game 7.2). This is not a highly co- 
operative weighting; nevertheless, it does result in mutual co- 
operation from move 719 onward. The initial and final event matrices 
are as follows: 


Game 11.7 — MAX versus MAX (p=37/100), Initial & Final Event Matrices 


100 moves:       MAX                1000 moves:      MAX
               c       d                           c       d
        C     14      23                    C    296      23
MAX                                 MAX
        D     23      40                    D     23     658

EUC = 1.14, EUD = 2.46              EUC = 2.78, EUD = 1.14
Score tied at 197                   Score tied at 1661


Now, compare this result with that of a game in which the 
initial co-operative weighting of the competitors is 36/100, or just 
under the threshold value:


Game 11.8 — MAX versus MAX (p=36/100), Initial & Final Event Matrices


100 moves:        MAX                 1000 moves:        MAX
                 c     d                                c     d
           C    13    23                          C    13    23
     MAX                                    MAX
           D    23    41                          D    23   941

     EUC = 1.08, EUD = 2.44                 EUC = 1.083, EUD = 1.095
     Score tied at 195                      Score tied at 1095


While the initial conditions of games 11.7 and 11.8 scarcely 
differ, the final results admit of considerable difference. Having 
established that the minimum threshold weighting of p = 37/100 leads 
to the eventual attainment of mutual co-operation at move 719, one 
might next find the maximum rapidity with which such co-operation can 
be attained. 

The highest admissible value of p, to the nearest 1/100, is
p = 99/100. (If p equals unity, then EUD is undefined owing to
division by zero.) At this maximum value of p, the following matrices
obtain:


Game 11.9 — MAX versus MAX (p=99/100), Initial & Final Event Matrices


100 moves:        MAX                 1000 moves:        MAX
                 c     d                                c     d
           C    98     1                          C   996     1
     MAX                                    MAX
           D     1     0                          D     1     2

     EUC = 2.97, EUD = 5                    EUC = 2.996, EUD = 2.333
     Score tied at 299                      Score tied at 2995


In game 11.9, only two mutual defections, at moves 101 and 102,
suffice to initiate perpetual mutual co-operation.
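The final event matrices of games 11.7 through 11.9 can be regenerated from their initial matrices. The following is a minimal sketch rather than the program used for the tournament. It assumes the payoffs 3, 0, 5 and 1, and it assumes that a competitor co-operates whenever EUC is at least as large as EUD (a tie-breaking rule inferred from the tabled move numbers, not stated in this chapter). The names prefers_cooperation and play_twins are hypothetical.

```python
# Assumed payoffs for game 7.2: reward, sucker's payoff, temptation, punishment.
R, S, T, P = 3, 0, 5, 1

def prefers_cooperation(w, x, y, z):
    """True when EUC >= EUD. The comparison is cross-multiplied so that it
    stays in exact integers; it assumes w + x > 0 and y + z > 0."""
    return (R * w + S * x) * (y + z) >= (T * y + P * z) * (w + x)

def play_twins(w, x, z, first_move=101, last_move=1000):
    """Deterministic phase for a symmetric event matrix (X = Y = x).
    Both twins face identical utilities, so every outcome is (C,c) or (D,d).
    Returns (onset of perpetual mutual co-operation or None,
             final event matrix, tied final score)."""
    onset = None
    for move in range(first_move, last_move + 1):
        if prefers_cooperation(w, x, x, z):
            w += 1
            if onset is None:
                onset = move      # first move of the current (C,c) run
        else:
            z += 1
            onset = None          # a mutual defection breaks the run
    tied_score = R * w + (S + T) * x + P * z
    return onset, (w, x, x, z), tied_score

for start in [(14, 23, 40), (13, 23, 41), (98, 1, 0)]:
    print(start, play_twins(*start))
```

Under these assumptions the three calls give an onset at move 719 with a score of 1661, no onset with a score of 1095, and an onset at move 103 with a score of 2995, matching the final matrices displayed above.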

Evidently, the number of mutual defections required to bring on 
mutual co-operation is a decreasing exponential function of initial 
co-operative weighting. An exponential curve-fit yields the following 
equation: 


n= f(p) = 70936 "MP for 37/100 < p< 1 


where n is the number of mutual defections between move 101 and the
onset of perpetual mutual co-operation and p is initial co-operative
weighting. The coefficient of determination for this exponential
equation is r² = .985.

Similarly, the final scores that result from these initial 
distributions can be fitted to a second exponential curve: 


s = gtf(p)] = 30076 "AA 


where s is the score after 1000 moves. The coefficient of
determination for this expression is r² = .9997.

Needless to say, the numerical coefficients of both curves 
depend upon the particular payoff structure and the length of the 
game, but the form of the curves is independent of these variables. 
In general, then, both the play that ensues from event matrices 
exhibiting symmetry across their major diagonals, and the scores 
which result from this play, conform to simple mathematical expres- 
sions. This class of event matrix gives rise to regular and readily 
comprehensible outcomes. 

The broader class of event matrices, whose members do not
exhibit symmetry across their main diagonals, is unfortunately (from
the viewpoint of simplicity) the far larger of the two classes. The
event matrices in this class give rise to the non-normal
distributions displayed in graphs 11.7 through 11.9. It is possible (and
desirable) to gain an understanding of how these distributions arise
without having to analyze tens of thousands, or even thousands, of
such matrices. Fortuitously, the process can be well represented by
tabling the results of a few dozen small probabilistic fluctuations
about the most probable outcome, for each of the strategic pairs.

Recall the notation for entries in the generalized event matrix:
W, X, Y and Z are the respective numbers of (C,c), (C,d), (D,c)
and (D,d) outcomes. One first considers the case of MEU versus MEU:

Table 11.1 — MEU versus MEU, Varying Event Matrices and Scores 


Initial      Perpetual  Final         Initial      Perpetual  Final         Initial      Perpetual  Final
(W,X,Y,Z)    (C,c)      Score         (W,X,Y,Z)    (C,c)      Score         (W,X,Y,Z)    (C,c)      Score

20,33,32,15  none       1139 - 1139   25,30,30,15  move 386   2370 - 2370   30,28,27,15  move 262   2622 - 2622
20,34,31,15  none       1138 - 1143   25,31,29,15  move 423   2299 - 2299   30,29,26,15  move 281   2587 - 2587
20,35,30,15  none       1137 - 1147   25,32,28,15  move 464   2220 - 2220   30,30,25,15  move 301   2550 - 2550
20,36,29,15  none       1136 - 1151   25,33,27,15  move 510   2131 - 2131   30,31,24,15  move 323   2509 - 2509
20,37,28,15  none       1131 - 1156   25,34,26,15  move 562   2030 - 2030   30,32,23,15  move 347   2464 - 2464

20,30,30,20  move 651   1830 - 1830   25,28,27,20  move 324   2488 - 2488   30,25,25,20  move 214   2709 - 2709
20,31,29,20  move 755   1625 - 1625   25,29,26,20  move 354   2431 - 2431   30,26,24,20  move 229   2682 - 2682
20,32,28,20  move 885   1368 - 1368   25,30,25,20  move 386   2370 - 2370   30,27,23,20  move 245   2653 - 2653
20,33,27,20  none       1139 - 1139   25,31,24,20  move 423   2299 - 2299   30,28,22,20  move 262   2622 - 2622
20,34,26,20  none       1138 - 1143   25,32,23,20  move 464   2220 - 2220   30,29,21,20  move 281   2587 - 2587

20,28,27,25  move 497   2132 - 2132   25,25,25,25  move 251   2625 - 2625   30,23,22,25  move 186   2759 - 2759
20,29,26,25  move 567   1995 - 1995   25,26,24,25  move 273   2584 - 2584   30,24,21,25  move 199   2736 - 2736
20,30,25,25  move 651   1830 - 1830   25,27,23,25  move 298   2537 - 2537   30,25,20,25  move 214   2709 - 2709
20,31,24,25  move 755   1625 - 1625   25,28,22,25  move 324   2488 - 2488   30,26,19,25  move 229   2682 - 2682
20,32,23,25  move 885   1368 - 1368   25,29,21,25  move 354   2431 - 2431   30,27,18,25  move 245   2653 - 2653

20,25,25,30  move 346   2425 - 2425   25,23,22,30  move 213   2695 - 2695   30,20,20,30  move 151   2820 - 2820
20,26,24,30  move 389   2342 - 2342   25,24,21,30  move 231   2662 - 2662   30,21,19,30  move 162   2801 - 2801
20,27,23,30  move 439   2245 - 2245   25,25,20,30  move 251   2625 - 2625   30,22,18,30  move 168   2788 - 2788
20,28,22,30  move 497   2132 - 2132   25,26,19,30  move 273   2584 - 2584   30,23,17,30  move 186   2759 - 2759
20,29,21,30  move 567   1995 - 1995   25,27,18,30  move 298   2537 - 2537   30,24,16,30  move 199   2736 - 2736

20,23,22,35  move 277   2557 - 2557   25,20,20,35  move 166   2780 - 2780   30,18,17,35  move 127   2858 - 2863
20,24,21,35  move 309   2496 - 2496   25,21,19,35  move 181   2753 - 2753   30,19,16,35  move 127   2853 - 2868
20,25,20,35  move 346   2425 - 2425   25,22,18,35  move 196   2726 - 2726   30,20,15,35  move 126   2850 - 2875
20,26,19,35  move 389   2342 - 2342   25,23,17,35  move 213   2695 - 2695   30,21,14,35  move 126   2845 - 2880
20,27,18,35  move 439   2245 - 2245   25,24,16,35  move 231   2662 - 2662   30,22,13,35  move 125   2842 - 2887


Columns labelled "Initial (W,X,Y,Z)" contain differing values of
these variables after the first 100 moves; i.e. contain different
probabilistically-generated event matrices. With each initial event
matrix, described by a set {W,X,Y,Z}, is associated the move number
on which perpetual mutual co-operation commences [column labelled
"Perpetual (C,c)"] in the deterministic phase of the game (moves 101-
1000) arising from that set. If no mutual co-operation occurs between
moves 101-1000, the entry for that set reads "none". The column
labelled "Final Score" associates the score (after 1000 moves) which
results from the given initial set {W,X,Y,Z}.

The sets of values {W,X,Y,Z} are arranged in blocks. Within
each block, the values of W and Z are held constant, while the
difference between X and Y increases. Each column of blocks holds the
value of W constant, while the value of Z increases from block to
block. Similarly, each row of blocks holds the value of Z constant,
while the value of W increases from block to block. Thus table 11.1
can be read both vertically and horizontally.

Reading down a column shows the effect of increasing initial
difference in anti-symmetric outcomes (within blocks), and of
increasing initial mutual defections (between blocks), upon the
attainment of perpetual mutual co-operation and upon the game score.
Reading across a row shows the effect of increasing initial mutual
co-operation upon the attainment of perpetual mutual co-operation and
upon the game score, with the number of initial mutual defections
held constant and the variance in difference between anti-symmetric
outcomes held to one.

Recall that, for MEU versus MEU, the most probable {W,X,Y,Z} is
{25,25,25,25}. In table 11.1, the sets of initial event matrices are
representative of the probabilistic fluctuations in these values that
would naturally occur in empirical trials. Three main tendencies, and
one interesting exception to them, are quickly made apparent by this
table.

First, within each block, the onset of perpetual mutual
co-operation (when it occurs) is increasingly delayed by increases in
the difference between X and Y. For a given number of mutual
co-operations, a given number of mutual defections, and an initial
unequal number of (C,d) and (D,c) outcomes, the MEU pair proceeds to
equalize the number of (C,d) and (D,c) outcomes. Once that happens,
their expected utilities become equal, and the pair defects until the
value of EUD is driven below that of EUC. Perpetual mutual
co-operation then ensues. A tied final score is indicative of this process.
The greater the initial difference between X and Y, the greater the
number of moves required for their equalization, and the greater
still the number of moves that must be made before mutual
co-operation is attained. Thus, for a given W and Z, the smaller the
initial difference between X and Y, the larger the final score.
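The equalization process described above can be traced move by move. The sketch below is not the tournament program. It assumes the payoffs 3, 0, 5 and 1 for game 7.2, and it assumes that each competitor co-operates whenever its own EUC is at least as large as its own EUD (ties resolved in favour of co-operation, as the tabled move numbers imply). The names prefers_cooperation and play_pair are hypothetical.

```python
# Assumed payoffs for game 7.2: reward, sucker's payoff, temptation, punishment.
R, S, T, P = 3, 0, 5, 1

def prefers_cooperation(w, x, y, z):
    """EUC >= EUD from one player's own event matrix, cross-multiplied so
    that ties are detected exactly. Assumes w + x > 0 and y + z > 0."""
    return (R * w + S * x) * (y + z) >= (T * y + P * z) * (w + x)

def play_pair(w, x, y, z, first_move=101, last_move=1000):
    """Deterministic phase for a maximization pair.
    (w, x, y, z) is the row player's event matrix after the random phase.
    Returns (onset of perpetual mutual co-operation or None,
             row player's final score, column player's final score)."""
    onset = None
    for move in range(first_move, last_move + 1):
        row_c = prefers_cooperation(w, x, y, z)
        col_c = prefers_cooperation(w, y, x, z)  # the twin sees X and Y interchanged
        if row_c and col_c:
            w += 1
            if onset is None:
                onset = move          # start of the current unbroken (C,c) run
        else:
            onset = None              # any other outcome breaks the run
            if row_c:
                x += 1                # (C,d)
            elif col_c:
                y += 1                # (D,c)
            else:
                z += 1                # (D,d)
    row_score = R * w + S * x + T * y + P * z
    col_score = R * w + S * y + T * x + P * z
    return onset, row_score, col_score

print(play_pair(25, 25, 25, 25))   # most probable MEU matrix
print(play_pair(25, 31, 29, 15))   # X and Y are equalized before co-operation
print(play_pair(30, 20, 15, 35))   # no equalization: scores remain unequal
```

Under these assumptions the three calls return move 251 with scores 2625 and 2625, move 423 with scores 2299 and 2299, and move 126 with scores 2850 and 2875, in agreement with the corresponding rows of table 11.1.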

Second, reading down the columns, one perceives that for a
constant value of W, the onset of perpetual mutual co-operation is
actually hastened as the initial number of mutual defections
increases. Within certain probabilistic limits, which vary according to
their initial weightings, the maximization strategies demonstrate the 
capacity of enlisting mutual defections in the service of perpetual 
mutual co-operation. While one wishes to refrain from lapsing into 
trite moralization, this counter-intuitive capacity suggests that, in 
certain instances, the game-theoretic end may justify the game- 
theoretic means. 

Third, reading across the rows, one perceives that for a 
constant value of Z, the onset of perpetual mutual co-operation is 
hastened as the initial number of mutual co-operations increases. 
This tendency is not surprising, but re-assuring in terms of the 
integrity of the maximization strategy. 

In general, table 11.1 shows that perpetual mutual co-operation
between MEU pairs, and thus their game scores, depend upon three
factors. The scores tend to increase as W increases with Z fixed, as
Z increases with W fixed, and as the difference between X and Y
decreases with both W and Z fixed. One can amalgamate the first two
tendencies, and observe that the scores tend to increase as the sum
of symmetric outcomes, W + Z, increases; or, equivalently, as the sum
of anti-symmetric outcomes, X + Y, decreases. This observation,
however, leads to the aforementioned exception.

The (30,X,Y,35) block boasts the largest W and Z values in
table 11.1, yet the results that stem from this block are not
altogether consistent with the tendencies so uniformly prevalent in the
rest of the table. To begin with, the onset of perpetual mutual
co-operation is hastened (albeit only slightly) as the difference
between X and Y increases, not decreases. And, as evidenced by the
absence of tied final scores, the MEU pairs in this block attain
perpetual mutual co-operation without having first equalized X and Y
values, and without ever equalizing them. The scores themselves are
the highest in the table, in keeping with this block's highest W + Z
sum. The significance of this unusual block will be brought to light
in subsequent tables.

Meanwhile, table 11.1 does account for the distribution of
scores in graph 11.7. One can observe the contributions toward
skewness, with a majority of scores occurring in the 2400-2700 point
range, and none exceeding 2900 points. Contributions to the minor
prominence in the 1100-1200 point range occur when the sum W + Z
falls below a certain threshold, making mutual co-operation
unattainable within 1000 moves; or when the sum W + Z is theoretically
sufficient for perpetual mutual co-operation, but the difference
between X and Y is large enough to prevent its onset. These latter
conditions prevail in the (20,X,Y,15) and (20,X,Y,20) blocks,
respectively.

Next, a similar table is considered for MAE versus MAE. Recall
that the most probable {W,X,Y,Z} for MAE versus MAE is {52,20,20,8}.
Table 11.2 (overleaf) displays corresponding fluctuations about these
most probable values, and the results to which they give rise.

Reading down the first column of table 11.2, one observes that
the two previous tendencies hold until the {40,X,Y,14} block; that
is, the onset of perpetual mutual co-operation is hastened as the
difference X - Y decreases within blocks, and as the sum X + Y
decreases between blocks. The {40,23,23,14} matrix of the {40,X,Y,14}
block also conforms to these tendencies. But the other matrices in
that block yield results comparable to those of the (30,X,Y,35) block
in table 11.1; that is, they give rise to perpetual mutual
co-operation without first equalizing X and Y values, and the onset of mutual
co-operation is hastened slightly as the difference X - Y increases.


Table 11.2 — MAE versus MAE, Varying Event Matrices and Scores


Initial      Perpetual  Final         Initial      Perpetual  Final         Initial      Perpetual  Final
(W,X,Y,Z)    (C,c)      Score         (W,X,Y,Z)    (C,c)      Score         (W,X,Y,Z)    (C,c)      Score

40,29,29,2   move 227   2715 - 2715   50,24,24,2   move 169   2836 - 2836   60,19,19,2   move 140   2899 - 2899
40,30,28,2   move 239   2694 - 2694   50,25,23,2   move 203   2754 - 2809   60,20,18,2   move 257   2622 - 2822
40,31,27,2   move 252   2671 - 2671   50,26,22,2   move 206   2742 - 2812   60,21,17,2   move 623   2059 - 2059
40,32,26,2   move 265   2648 - 2648   50,27,21,2   move 209   2730 - 2815   60,22,16,2   move 668   1975 - 1975
40,33,25,2   move 280   2621 - 2621   50,28,20,2   move 211   2720 - 2820   60,23,15,2   move 717   1883 - 1883

40,28,27,5   move 216   2734 - 2734   50,23,22,5   move 158   2851 - 2856   60,18,17,5   move 623   2059 - 2059
40,29,26,5   move 227   2715 - 2715   50,24,21,5   move 209   2730 - 2815   60,19,16,5   move 668   1975 - 1975
40,30,25,5   move 239   2694 - 2694   50,25,20,5   move 211   2720 - 2820   60,20,15,5   move 717   1883 - 1883
40,31,24,5   move 252   2671 - 2671   50,26,19,5   move 210   2717 - 2827   60,21,14,5   move 263   2585 - 2850
40,32,23,5   move 265   2648 - 2648   50,27,18,5   move 212   2707 - 2832   60,22,13,5   move 859   1614 - 1614

40,26,26,8   move 195   2770 - 2770   50,21,21,8   move 148   2869 - 2869   60,16,16,8   move 124   2922 - 2922
40,27,25,8   move 205   2753 - 2753   50,22,20,8   move 211   2720 - 2820   60,17,15,8   move 717   1883 - 1883
40,28,24,8   move 216   2734 - 2734   50,23,19,8   move 210   2717 - 2827   60,18,14,8   move 263   2585 - 2850
40,29,23,8   move 227   2715 - 2715   50,24,18,8   move 212   2707 - 2832   60,19,13,8   move 859   1614 - 1614
40,30,22,8   move 239   2694 - 2694   50,25,17,8   move 214   2697 - 2837   60,20,12,8   move 265   2568 - 2868

40,25,24,11  move 180   2793 - 2798   50,20,19,11  move 210   2717 - 2827   60,15,14,11  move 263   2585 - 2850
40,26,23,11  move 195   2770 - 2770   50,21,18,11  move 212   2707 - 2832   60,16,13,11  move 859   1614 - 1614
40,27,22,11  move 205   2753 - 2753   50,22,17,11  move 214   2697 - 2837   60,17,12,11  move 265   2568 - 2868
40,28,21,11  move 216   2734 - 2734   50,23,16,11  move 443   2357 - 2357   60,18,11,11  move 268   2555 - 2875
40,29,20,11  move 227   2715 - 2715   50,24,15,11  move 213   2688 - 2853   60,19,10,11  none       1334 - 1359

40,23,23,14  move 166   2819 - 2819   50,18,18,14  move 129   2898 - 2898   60,13,13,14  move 110   2941 - 2941
40,24,22,14  move 166   2814 - 2824   50,19,17,14  move 214   2697 - 2837   60,14,12,14  move 265   2568 - 2868
40,25,21,14  move 166   2809 - 2829   50,20,16,14  move 443   2357 - 2357   60,15,11,14  move 268   2555 - 2875
40,26,20,14  move 165   2806 - 2836   50,21,15,14  move 213   2688 - 2853   60,16,10,14  none       1334 - 1359
40,27,19,14  move 165   2801 - 2841   50,22,14,14  move 523   2209 - 2209   60,17,9,14   none       1327 - 1372


Reading down the second column, one observes that this departure
from precedent tendency now becomes the norm itself. With the
obvious exception of matrices in which X equals Y initially, the
second column of blocks behaves as the last block in the first
column. Note that, within each block except the first, the order of
the onset of perpetual mutual co-operation is increasingly jumbled.

The most important overall effect of this departure, exemplified
in the first three blocks of column two, is reflected in the
final scores. Because the X and Y values are not equalized prior to
perpetual mutual co-operation, the gap between the final scores
increases as the initial difference between X and Y increases. Owing
to the vicissitudes of chance during the first hundred moves, one
member of the MAE pair finds that joint occurrences of its
co-operation and its twin's defection outnumber joint occurrences of its
defection and its twin's co-operation. In the {W,X,Y,Z} region under
consideration, this member's final score decreases, while its twin's
increases, as the initial difference X - Y becomes larger.

Then, suddenly, in the (50,X,Y,11) block, a new phenomenon is 
manifest. Four of five sets in this block give rise to perpetual 
mutual co-operation between moves 210-214, with respective final 
scores within the 2688-2853 point range. But the {50,23,16,11} 
matrix, which contains neither the largest nor the smallest (X,Y) 
difference in the block, gives rise to an unexpectedly large number 
of mutual defections, with the onset of perpetual mutual co-operation 
delayed until move 443. The resultant final score, tied at 2357 
points, indicates that X and Y values are once again equalized during 
the game. 

This phenomenon becomes increasingly frequent, and more drastic,
through the balance of column two, and throughout column three.
For instance, consider what takes place in the (60,X,Y,8) block. The 
first matrix, (60,16,16,8), gives rise to early perpetual mutual co- 
operation, commencing on move 124, and the MAE twins attain a cor- 
respondingly high score, tied at 2922 points. But the second matrix, 
{60,17,15,8}, leads to comparative disaster: perpetual mutual co- 
operation does not commence until move 717, and the pair attains a 
correspondingly low final score, tied at 1883 points. Hence, a small 
increment in the difference between X and Y produces a momentous 
delay in the onset of perpetual mutual co-operation, with a cor- 
respondingly large decrement in the final scores. 

The third matrix in the block, {60,18,14,8}, reverses the
previous disaster. Perpetual mutual co-operation begins at move 263,
which is not unreasonable in light of the initial (X,Y) difference.
No equalization of (X,Y) values takes place, and the final scores are
therefore fairly high but disparate, at 2585-2850 points. But the
fourth matrix, {60,19,13,8}, leads to renewed disaster, with
perpetual mutual co-operation commencing only on move 859, and a
resultant low tied score of 1614 points.

The culmination of these alternating radical changes appears in
the last two blocks of column three. The combination of a
sufficiently large W + Z sum and a sufficiently large X - Y difference can
result in perpetual mutual defection from move 101 to the end of the
game. In such cases, the MAE pair attains scores of less than 1400
points.

Evidently, the event matrix becomes increasingly unstable as
the sum of symmetric outcomes begins to exceed that of anti-symmetric
outcomes. The expected utilities associated with these outcomes begin
to reverse their prescriptions with each increment of the (X,Y)
difference, and the pendulum of joint outcomes swings steadily away from
perpetual mutual co-operation, and toward perpetual mutual defection,
as W + Z grows and X + Y diminishes.
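One way to see why the prescriptions reverse is to write out the condition under which each twin first prefers co-operation. The derivation below is a sketch resting on two assumptions that are not restated in this chapter: the payoffs of game 7.2 are taken to be 3, 0, 5 and 1 for the (C,c), (C,d), (D,c) and (D,d) outcomes, and a competitor is taken to co-operate whenever EUC is at least as large as EUD. The symbols Z_row and Z_col are introduced here for illustration only; they denote the number of mutual defections at which the row player and its twin, respectively, first prefer co-operation for the current values of W, X and Y.

```latex
% Assumed payoffs: 3 for (C,c), 0 for (C,d), 5 for (D,c), 1 for (D,d).
% Row player, with own event matrix {W, X, Y, Z} and 2W > X:
\[
\frac{3W}{W+X} \;\ge\; \frac{5Y+Z}{Y+Z}
\quad\Longleftrightarrow\quad
Z \;\ge\; Z_{\mathrm{row}} = \frac{Y\,(2W+5X)}{2W-X}.
\]
% The twin sees X and Y interchanged (with 2W > Y):
\[
Z \;\ge\; Z_{\mathrm{col}} = \frac{X\,(2W+5Y)}{2W-Y}.
\]
% Subtracting the two thresholds over a common (positive) denominator:
\[
Z_{\mathrm{col}} - Z_{\mathrm{row}}
= \frac{(X-Y)\,\bigl[\,4W^{2} - 2W(X+Y) - 5XY\,\bigr]}{(2W-X)(2W-Y)} ,
\]
% so that, for X > Y,
\[
Z_{\mathrm{col}} < Z_{\mathrm{row}}
\quad\Longleftrightarrow\quad
4W^{2} \;<\; 2W(X+Y) + 5XY .
\]
```

For the matrix {25,31,29,Z}, 4W² = 2500 is less than 2W(X+Y) + 5XY = 7495, so the twin with fewer (C,d) outcomes of its own co-operates first and the (X,Y) difference narrows, as in table 11.1. For {60,17,15,Z}, 4W² = 14400 exceeds 5115, so the twin that has already been exploited more often co-operates first and the difference initially widens. For {30,20,15,Z} the two sides are equal at 3600, both thresholds equal 60, and the twins begin to co-operate on the same move.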

Table 11.2 does account for the distribution of scores in graph
11.8, albeit in an unexpected fashion. When random fluctuations about
the most probable event matrix, {52,20,20,8}, are relatively small,
the scores attained are fairly high. Larger fluctuations which reduce
the sum W + Z do not substantially reduce the final scores. But
larger fluctuations which increase the sum W + Z produce both the
highest scores in the distribution (when X = Y), as well as the
lowest scores (when X - Y is sufficiently large).

Next, a similar table is considered for MAC versus MAC. The
process leading to the fragmented distribution of scores for 500
games of MAC versus MAC, as displayed in graph 11.9, is well-depicted
in table 11.3 (overleaf). Table 11.3 shows a continuation of the new
tendency observed in table 11.2; namely, a transition to increasingly
less stable event matrices. Recall that the most probable event
matrix for MAC versus MAC is {81,9,9,1}. This set of values evidently
lies in a highly unstable region of the {W,X,Y,Z} spectrum, in which
probabilistic fluctuation gives rise to one of three situations.
Together, the three situations account for the fragmentation of the
MAC pair's distribution.


Table 11.3 — MAC versus MAC, Varying Event Matrices and Scores


Initial      Perpetual  Final         Initial      Perpetual  Final         Initial      Perpetual  Final
(W,X,Y,Z)    (C,c)      Score         (W,X,Y,Z)    (C,c)      Score         (W,X,Y,Z)    (C,c)      Score

79,10,10,1   move 115   2960 - 2960   81,9,9,1     move 113   2965 - 2965   83,8,8,1     move 111   2970 - 2970
79,11,9,1    move 364   2352 - 2887   81,10,8,1    none       1330 - 1585   83,9,7,1     none       1318 - 1623
79,12,8,1    none       1330 - 1565   81,11,7,1    none       1318 - 1603   83,10,6,1    none       1306 - 1641
79,13,7,1    none       1318 - 1583   81,12,6,1    none       1306 - 1621   83,11,5,1    move 396   2256 - 2931
79,14,6,1    none       1306 - 1601   81,13,5,1    move 385   2278 - 2933   83,12,4,1    none       1285 - 1680

79,10,9,2    move 364   2352 - 2887   81,9,8,2     none       1330 - 1585   83,8,7,2     none       1318 - 1623
79,11,8,2    none       1330 - 1565   81,10,7,2    none       1318 - 1603   83,9,6,2     none       1306 - 1641
79,12,7,2    none       1318 - 1583   81,11,6,2    none       1306 - 1621   83,10,5,2    move 396   2256 - 2931
79,13,6,2    none       1306 - 1601   81,12,5,2    move 385   2278 - 2933   83,11,4,2    none       1285 - 1680
79,14,5,2    move 375   2298 - 2933   81,13,4,2    none       1285 - 1660   83,12,3,2    none       1272 - 1702

79,9,9,3     move 111   2965 - 2965   81,8,8,3     move 109   2970 - 2970   83,7,7,3     move 107   2975 - 2975
79,10,8,3    none       1330 - 1565   81,9,7,3     none       1318 - 1603   83,8,6,3     none       1306 - 1641
79,11,7,3    none       1318 - 1583   81,10,6,3    none       1306 - 1621   83,9,5,3     move 396   2256 - 2931
79,12,6,3    none       1306 - 1601   81,11,5,3    move 385   2278 - 2933   83,10,4,3    none       1285 - 1680
79,13,5,3    move 375   2298 - 2933   81,12,4,3    none       1285 - 1660   83,11,3,3    none       1272 - 1702

79,8,8,5     move 107   2970 - 2970   81,7,7,5     move 105   2975 - 2975   83,6,6,5     move 104   2978 - 2978
79,9,7,5     none       1318 - 1583   81,8,6,5     none       1306 - 1621   83,7,5,5     move 396   2256 - 2931
79,10,6,5    none       1306 - 1601   81,9,5,5     move 385   2278 - 2933   83,8,4,5     none       1285 - 1680
79,11,5,5    move 375   2298 - 2933   81,10,4,5    none       1285 - 1660   83,9,3,5     none       1272 - 1702
79,12,4,5    none       1285 - 1640   81,11,3,5    none       1272 - 1682   83,10,2,5    none       1259 - 1724

79,7,6,8     none       1306 - 1601   81,6,5,8     move 101   2976 - 2981   83,5,4,8     move 101   2977 - 2982
79,8,5,8     move 375   2298 - 2933   81,7,4,8     none       1285 - 1660   83,6,3,8     move 101   2972 - 2987
79,9,4,8     none       1285 - 1640   81,8,3,8     none       1272 - 1682   83,7,2,8     move 101   2967 - 2992
79,10,3,8    none       1272 - 1662   81,9,2,8     none       1259 - 1704   83,8,1,8     none       1245 - 1750
79,11,2,8    none       1259 - 1684   81,10,1,8    none       1245 - 1730   83,9,0,8     none       1231 - 1776


First, perpetual mutual co-operation can be attained very
rapidly (as on move 115 in the (79,X,Y,1) block), or even immediately
(as on move 101 in the (83,X,Y,8) block). The onset of rapid
perpetual mutual co-operation, when it occurs, is hastened as the sum W
+ Z increases. And when it does occur, it results in very high
(though not necessarily equal) scores for both twins, in the 2960-
2992 point range. This situation contributes to the prominence at the
high end of the scale in graph 11.9.

Second, the onset of perpetual mutual co-operation can be
noticeably retarded, occurring anywhere between move 364 and move 396
in table 11.3. The delay increases with the sum W + Z. And the
delay, when it occurs, marks a disparity in the final scores. One
pair-member attains roughly 2800-2950 points; the other, roughly
2200-2500 points. This situation thus contributes to the high-range
prominence, and it forms the prominence in the next-lowest point
range in graph 11.9. The trough from 2600-2800 points occurs,
self-evidently, because no probabilistic event matrix in this region of
the {W,X,Y,Z} spectrum can give rise to a deterministic score in that
range.

Third, there may be no onset of perpetual mutual co-operation. 
Such cases give rise to disparate, low final scores. The range of the 
disparity varies roughly from 250 points to 550 points. This range 
increases, between blocks, with the sum W + Z; and it increases, 
within blocks, with the difference X- Y. A typical score is 1621- 
1306 points. This situation contributes to the two other prominences, 
in the 1500-1700 and 1300 point ranges of graph 11.9. Again, troughs 
occur in the 1900-2200 and 1000-1200 point ranges because such scores 
are deterministically inaccessible from the event matrices in this 
probabilistic region of the (W,X,Y,Z) spectrum. 

These three respective situations occur consecutively in the
(83,X,Y,5) block of table 11.3. The instability of the event matrix
is well evidenced in this block. The matrix (83,6,6,5) gives rise to
perpetual mutual co-operation on move 104, and results in a final
score tied at 2978 points. When the (X,Y) values fluctuate from (6,6)
to (7,5), perpetual mutual co-operation does not begin until move
396, with a resultant score of 2256-2931. One further fluctuation in
(X,Y) values, from (7,5) to (8,4), debars further perpetual mutual
co-operation from this block, and results in scores such as 1285-
1680. Thus, in this block, an initial (X,Y) difference of only 4
causes severe decrements, of 1693 and 1298 points, to the final
scores of the MAC twins.

In sum, tables 11.1, 11.2 and 11.3 account for the different
non-normal distributions of final scores in repeated encounters
between MEU-MEU, MAE-MAE and MAC-MAC pairs. Moreover, these tables
reveal some unexpected, interesting and shifting tendencies across
the spectrum of possible event matrices. These tendencies convey an
understanding of the general nature of the relationship between the
probabilistic and deterministic phases of the maximization family's
play.

This understanding extends to cases in which siblings, rather 
than twins, are paired. One need not resort to further tedious 
analyses of numerous representative probabilistic fluctuations, but 
one might outline just one case to illustrate how the understanding 
can be applied. One hundred games of MAC versus MAE give the dis- 
tributions of final scores displayed in graph 11.10. 

The most probable event matrix for MAC versus MAE is 
{64,26,7,3}, which gives rise to perpetual mutual co-operation on 
move 295, and thence to the most probable score of MAC 2473, MAE 
2913. But the average score for one hundred games is found to be MAC 
1849, MAE 2123. Again, the distributions explain the discrepancy. But 
what gives rise to the distributions? 


Graph 11.10 MAE versus MAC
Histogram of Scores for 100 Games
Average Score: MAE 2123, MAC 1849

[Histogram not reproduced: frequency plotted against range of scores
x 1/100, from 11 to 31, with separate curves for MAC's scores and
MAE's scores.]


In the initial event matrix, let W and Z be held constant at
their most probable respective values of 64 and 3, and let (X,Y)
fluctuate from (25,8) to (29,4). Then the following results obtain:


Table 11.4 — MAC versus MAE, Varying Event Matrices and Scores


Initial (W,X,Y,Z)   Perpetual (C,c)   Final Score
64,25,8,3           none              1320 - 1425
64,26,7,3           move 295          2473 - 2913
64,27,6,3           move 299          2457 - 2922
64,28,5,3           move 302          2443 - 2933
64,29,4,3           none              1280 - 1495


Table 11.4 shows how the distributions in graph 11.10 arise.
The probabilistic event matrices for MAC versus MAE lie in an
unstable region of the {W,X,Y,Z} spectrum, from which two main
deterministic states are accessible. Perpetual mutual co-operation either
commences around move 300, or it does not commence at all. The first
state contributes to the higher point-range features in the
respective distributions; the second, to the lower. In the first situation,
MAE defeats MAC by a typical score of 2900-2500; in the second
situation, by a typical score of 1600-1350.

Similar outlines could naturally be drawn to account for the 
results of other encounters between maximization family siblings. But 
the foregoing analyses convey an appreciation of the reason for MAC's 
relatively poor performances against its twin and its siblings, as 
displayed in table 10.2. MAC's initially high co-operative weighting, 
which stands MAC in better stead than its siblings in competition 
against other strategic families, militates against MAC in intra- 
familial competition. MAC's probabilistic event matrices span an 
unstable region of the (W,X,Y,Z) spectrum, and the instability causes 
moderate to extreme discrepancies between MAC's most probable and 
average scores. 

MAC's less co-operatively weighted siblings, MAE and MEU, are
also afflicted by this intra-familial syndrome, but to correspondingly
lesser extents. MAD is immune to it; hence MAD's most probable and
average scores coincide. But MAD's immunity is conferred by a
property which entails far worse consequences in the interactive
environment; namely, the inability to cross the threshold of perpetual
mutual co-operation. Hence, MAD's prophylactic measure is more
debilitating than the syndrome which it prevents.

Given the broad range of possible final scores of the more co- 
operative maximization strategies’ intra-familial play, it seems 
justifiable to have recorded their average scores against one another 
in repeated encounters in the main tournament table, as opposed to 
scores resulting from single encounters. Since the scores between
other strategies (as well as between maximization and other
strategies) have either pre-determined, deterministic, or
normally-distributed values, such scores can be more confidently recorded from
single encounters.

It must be fairly observed that the main tournament standings 
could be affected by replacing the existing intra-familial maximiza- 
tion strategies' scores, which are averaged for five hundred en- 
counters between twins and one hundred encounters between siblings, 
with scores resulting from single encounters. Specifically, if MAE 
were to realize its lowest probabilistic scores in all intra-familial 
pairings, it would slip to third place, behind SHU, in the main
tournament standings. Similarly, MEU would slip from seventh to
eighth, and possibly to ninth place in the standings. Then again, if 
MEU were to realize its highest probabilistic scores in all intra- 
familial pairings, it might climb past ETH in the standings. Any such 
changes could affect the order of overall robustness. 

But significantly, the combination of a propitious intra- 
familial showing by MAE, and a poor one by MAC, would still not allow 
MAE to overtake MAC in the standings. And because MAC would thus 
maintain its hold on first place, one can hypothesize that MAC would 
continue to remain most robust overall in the interactive environ- 
ment. 

In sum, this averaging procedure yields a main tournament order 
which is not necessarily absolute, but which seems to be fair. And 
now that a deeper understanding of the performances of the maximiza- 
tion strategies has been reached, one can proceed to the summary and 
conclusions of this enquiry.


Chapter Twelve
Summary and Conclusions 


Before conclusions are drawn from this study of strategic 
interaction in the Prisoner's Dilemma, its perspective and principal 
findings are summarized. Parts One and Two of the enquiry are theore- 
tical in nature; Parts Three and Four, experimental. At an appropri- 
ate stage, one must ask whether some form of continuity obtains 
between theory and experiment. 

Part One outlines a game-theoretic background against which the 
main problem of the enquiry is configured. 

Chapter One reviews fundamental taxonomic criteria of the 
theory of games, and introduces some received terminology in the 
process. One of the main strengths of the theory lies in its ability 
to classify games, but the intent of such classification is not 
restricted to the establishment of a reliable taxonomy. Because game 
theory embraces an awareness of its own limitations, the very act of 
classifying a given game is also a means of determining whether the 
theory is prescriptive, or merely descriptive, of the actual play. 
For those (relatively few) classes of games over which the theory 
holds normative sway; i.e., up to and including two-person, zero-sum 
games whose matrices contain saddle-points, the theory can prescribe 
the "best" moves according to minimax and maximin criteria. For those 
(relatively many) classes of games which are neither zero-sum nor 
strictly determined, the theory can still prove useful in a descrip- 
tive capacity. 

Chapter Two discusses one of the main weaknesses of game 
theory; namely, its necessary but problematic incorporation of 
utility theory. With the enlistment of the utility function, game 
theory inherits a wealth of problems. These range from conflicting 
interpretations in the philosophy of probability to practical dif- 
ficulties latent in probabilistic calculi, and from value-ordering 
scales in the intra-personal comparison of utilities to lack of 
value-equivalence in the inter-personal comparison of utilities. The 
utile is posited as a unit of utility, and is assumed to be a pure, 
conserved quantity. The utile enables game-theoretic modelling of 
qualitative situations of risk and conflict of interest by mapping 
players' preferences to the real numbers. The utile is thus an 
indispensable but also largely hypothetical unit of measure. 

Chapter Three discusses a second area of game-theoretic conten- 
tion; namely, the vexed question of rationality. It appears that no 
definition of game-theoretic rationality has been articulated that is 
universally acceptable or satisfactory. Examples are cited to support 
the argument that the rationality or irrationality of a player cannot 
be reliably assessed from his play alone. A move that appears tacti- 
cally disadvantageous in an isolated context, such as a bluff in 
poker or the sacrifice of a piece in chess, can be quite sound from a 
strategic point of view. Depending on the nature of the game, a 
losing tactic can form part of a winning strategy. Similarly, a game 
itself can be lost in order that an associated meta-game be won. 

This enquiry posits a criterion of rationality (and irrational- 
ity) which depends not upon winning (or losing) per se, but upon 
maintaining consistency between one's preference in a game and such 
play as conduces to the realization of that preference. If a player 
prefers to win a game, he is said to be rational if he plays accord- 
ing to the best of his ability to win. By the same token, if a player 
prefers to lose a game, he is said to be rational if he plays accord- 
ing to the best of his ability to lose. This criterion is quintessen- 
tially game-theoretic in character, for it adheres to Rapoport's 
precept of appealing to the logical, as opposed to the psychological, 
aspects of play.1 One does not enquire into the motives underlying a 
player's preference; one merely seeks consistency between that 
preference and the principle or strategy which the player chooses to 
implement. 

Part Two examines the Prisoner's Dilemma in the static mode, 
with the object of exposing several levels at which the dilemma 
persists, despite ingenious attempts to resolve it. As a two-person, 
non-zero-sum, non-co-operative game, the Prisoner's Dilemma has no 
infallible, prescriptive resolution. 


1 Rapoport, 1966, p.103. 

Chapter Four presents the paradigmatic case of the Prisoner's 
Dilemma, and shows how the dilemma can be represented as a fundamen- 
tal conflict between two principles of choice: dominance versus 
maximization of expected utility. The dominance principle dictates 
that prisoner A fares better by defecting than by co-operating, 
regardless of what prisoner B elects to do. But if both prisoners 
adopt the dominance principle, they attain a mutually-detrimental 
outcome. Maximization of expected utility prescribes unequivocal co- 
operation in the event of complete probabilistic dependence; i.e., 
when the probability of a joint similar outcome, either (C,c) or 
(D,d), is unity. In the event of partial probabilistic dependence, 
the maximization principle prescribes either co-operation or defec- 
tion, depending upon the relative values of the probability of a 
joint similar outcome and the particular payoff structure of the 
game. If both players co-operate, they attain a mutually-beneficial 
outcome. 
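
The comparison of expected utilities summarized above can be sketched numerically. What follows is only an illustration under stated assumptions, not the enquiry's own program: a single probability p of a joint similar outcome is assumed to govern both (C,c) and (D,d), and the payoff values 3, 0, 5 and 1 are assumed defaults. The function names are hypothetical.

```python
def expected_utilities(p_similar, R=3.0, S=0.0, T=5.0, P=1.0):
    """Expected utility of co-operating and of defecting.

    p_similar is the assumed probability of a joint similar outcome,
    either (C,c) or (D,d). R, S, T, P are assumed payoffs for mutual
    co-operation, unilateral co-operation, unilateral defection and
    mutual defection respectively.
    """
    eu_cooperate = p_similar * R + (1.0 - p_similar) * S
    eu_defect = p_similar * P + (1.0 - p_similar) * T
    return eu_cooperate, eu_defect


def maximizing_choice(p_similar, **payoffs):
    """Return 'C' if co-operation has the larger expected utility, else 'D'."""
    eu_cooperate, eu_defect = expected_utilities(p_similar, **payoffs)
    return "C" if eu_cooperate > eu_defect else "D"


print(maximizing_choice(1.0))  # complete dependence: co-operate
print(maximizing_choice(0.5))  # weak dependence: defect
```

Under these assumed payoffs the rule co-operates only when p exceeds 5/7, and it co-operates unequivocally at p = 1, in keeping with the account of complete probabilistic dependence given above.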

This fundamental conflict of principle can be viewed as a 
conflict of rationality. In Rapoport’s terms, it is individually 
rational to defect, but collectively rational to co-operate.2 If two 
individual rationalists are caught in the dilemma, they both defect 
and reap detrimental deserts. If two collective rationalists are 
caught in the dilemma, they both co-operate and thereby extricate 
themselves. But if one prisoner defects while the other co-operates, 
the defector receives the largest reward while the co-operator 
sustains the worst punishment. How then can a collective rationalist 
avoid being exploited by an individual rationalist? This enquiry 
suggests that he can do so by adopting the principle of the maximiza- 
tion of expected utility, and taking as p(c/C) the probability that 
the other prisoner is collectively rational. 

Naturally, the suggestion is highly theoretical, and difficult 
to implement. Were it not so, the Prisoner's Dilemma would be resol- 
ved. In the static mode, this suggestion merely transposes the 
problem from one kind of dilemma to another. To wit: neither prisoner 
has at his disposal an objective method for ascertaining the probabi- 
lity that the other is collectively rational. Nonetheless, the 
persistence of the dilemma does not detract from the acuteness of 
Rapoport's distinction between individual and collective rationality. 


2 Ibid, p.146. 

Chapter Five examines another attempted resolution of the 
dilemma, this time in decision-theoretic terms. The resolution 
proceeds from a reformulation of Newcomb's Paradox. Newcomb's Paradox 
is a game against a state of nature, in which the player faces 
divergent dictates of the same two principles of choice encountered 
in the Prisoner's Dilemma: dominance versus maximization of expected 
utility. The reformulation of the paradox effectively eliminates the 
dominance principle from contention. Owing to the insights of Brams 
and Lewis, it is possible to view the Prisoner's Dilemma as a dual 
Newcomb's Paradox.3 When the Prisoner's Dilemma is reformulated in a 
similar fashion, the dominance principle is again eliminated. This 
leaves maximization of expected utility as the remaining decision- 
theoretic principle, with mutual co-operation as a possible outcome. 

As a result of this reformulation, a given prisoner's delibera- 
tion shifts from "What will the other prisoner choose?" to "Will the 
other prisoner correctly predict what I choose?" If the answer to the 
second question is in the affirmative, then the given prisoner co- 
operates; if in the negative, then he defects. But, notwithstanding 
this reformulation, the dilemma persists. It does so because, once 
again, a prisoner has no objective means by which to answer the new 
question reliably. Hence, even two collectively rational prisoners 
might still both defect, owing to mutual errors in judgement. 

Chapter Six examines Howard's meta-game resolution of the 
Prisoner's Dilemma. A second-level meta-game of conditional strate- 
gies generates a matrix in which defection is neither strongly nor 
weakly dominant. But another kind of dominance emerges from the 
individually rational prisoner's consideration of the meta-strategic 
possibilities involved; namely, set-theoretic dominance. In the meta- 
game situation, set-theoretic dominance converges with the maximiza- 
tion of expected utility in prescribing co-operation. Thus, the meta- 
game formulation of the Prisoner's Dilemma marks a reconciliation 
between individual and collective rationality. 


3 Respectively: Brams, 1975; and Lewis, 1979. 

Nonetheless, the dilemma is still not resolved in a legislative 
sense. Even if both prisoners were made aware of the existence of the 
meta-game resolution, they would not be compelled to implement it. 
Indeed, such awareness could actually lead to mutual defection, if 
each prisoner sought to take advantage of the other's meta-game 
justification for co-operation. 

Part Three examines the Prisoner's Dilemma in the iterated 
mode, by means of a simulated tournament of interacting strategies. 
Twenty different strategies compete against one another (and their 
twins) in games of one thousand moves in length. The experiment asks 
two main questions: which strategy (or strategies) are most robust in 
the given environment, and why? 

Chapter Seven summarizes the salient results of Axelrod's two 
previous tournaments, and shows how the interactive tournament is 
intended to complement his experiments. These complementary tourna- 
ments differ primarily in terms of the constitution of their strate- 
gic populations. Axelrod draws upon a population of "wild" strate- 
gies, by soliciting unrestricted contributions from diverse sources. 
The interactive tournament draws upon a population of "captive" and 
"domesticated" strategies, both by "capturing" interesting wild types 
and by selectively "breeding" a range of experimental traits. 

The interactive population is heuristically grouped into five 
"families" of strategies: the probabilistic family, the tit-for-tat 
family, the maximization family, the optimization family, and the 
hybrid family. Each family's members are related either closely, by 
program structure, or more distantly, by conceptual function. 

The interactive tournament's controlled environment facilitates 
comparison of the effectiveness of closely-related strategies, whose 
programs differ by the value of a single parameter. This type of 
control, where applicable, thus enables a parametric assessment of 
performance. It applies herein to the probabilistic family and to the 
maximization family. The relative effectiveness of these two groups 
of strategies, both intra-familially and in the overall environment, 
is of particular relevance to this enquiry, since these families 
contain the strategic equivalents of the divergent principles of 
choice encountered in the static mode. It is desirable to know how 
these principles fare as dynamic strategies, in the iterated mode. 

Chapter Eight examines the scores of the main tournament. The 
winner is MAC, the most co-operatively weighted member of the maxi- 
mization family. An analysis of all possible sub-tournaments, formed 
by exhausting all combinations of the twenty strategies in groups of 
two to twenty competitors, again shows MAC to be the most efficient 
strategy in the environment. The measure of a strategy's efficiency 
is taken to be the ratio of the number of strategies it betters to 
the number of strategies it encounters. 

A strategy which is successful in a number of different situa- 
tions is said to be robust. As MAC is the most efficient strategy in 
the greatest number of sub-tournaments, MAC is the most robust 
strategy with respect to combinatoric criteria. 
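
The combinatoric analysis can be illustrated with a short sketch. It is not the enquiry's program. It assumes that a strategy "betters" a rival when its total score within the sub-tournament is higher, and that each competitor's total includes play against its own twin; the three strategies X, Y and Z and their scores are hypothetical.

```python
from math import comb


def count_subtournaments(n, smallest=2):
    """Number of ways of choosing groups of `smallest` to n competitors from n."""
    return sum(comb(n, k) for k in range(smallest, n + 1))


def efficiency(strategy, group, pair_score):
    """Ratio of rivals bettered to rivals encountered within `group`.

    pair_score[a][b] is the score of a when playing b. "Bettered" is
    assumed here to mean "outscored on total points within the group".
    """
    totals = {a: sum(pair_score[a][b] for b in group) for a in group}
    rivals = [b for b in group if b != strategy]
    bettered = sum(1 for b in rivals if totals[strategy] > totals[b])
    return bettered / len(rivals)


# Hypothetical scores, for illustration only.
pair_score = {
    "X": {"X": 3, "Y": 0, "Z": 5},
    "Y": {"X": 5, "Y": 1, "Z": 1},
    "Z": {"X": 0, "Y": 1, "Z": 3},
}

print(count_subtournaments(20))                      # 1048555
print(efficiency("X", ("X", "Y", "Z"), pair_score))  # 1.0
print(efficiency("X", ("X", "Y"), pair_score))       # 0.0
```

In the toy data, X is fully efficient among all three competitors but betters no one once Z is removed, which shows why efficiency must be assessed over many sub-tournaments before robustness is claimed.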

Chapter Nine generates a series of ecosystemic competitions, 
based on Axelrod's ecological scenario.4 In the ecological scenario, 
the ratio of two strategies’ scores is assumed to represent the ratio 
of their populations in direct competition against one another. The 
ratio of their offspring in the next generation is assumed to be 
proportional to the ratio of their directly competing populations in 
the previous generation, and to their respective fractional represen- 
tations in the overall population. 

After a certain number of generations, the rate of population 
change becomes negligibly small for all competing strategies. One or 
more strategies may become "extinct", while each survivor attains a 
stable population level in his particular niche. After stability is 
attained, the scores of the first strategy to have become extinct are 
withdrawn from the pool, and a new ecosystemic competition is genera- 
ted among the survivors. The process is repeated until no further 
extinctions take place. 
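
One generation of such a competition, and the approach to stability, might be sketched as follows. This is a hedged illustration only. It substitutes an Axelrod-style update, in which each strategy's next share is proportional to its current share multiplied by its population-weighted average score, and this may differ in detail from the rule used in Chapter Nine. The strategy names, the scores and the tolerance are hypothetical.

```python
def next_generation(shares, score):
    """Return the population shares after one generation.

    shares[a] is the fraction of the population playing a, and
    score[a][b] is the score of a against b. The update rule is an
    assumed, Axelrod-style proportional rule.
    """
    fitness = {a: sum(shares[b] * score[a][b] for b in shares) for a in shares}
    weighted = {a: shares[a] * fitness[a] for a in shares}
    total = sum(weighted.values())
    return {a: w / total for a, w in weighted.items()}


def run_until_stable(shares, score, tol=1e-9, max_generations=10_000):
    """Iterate until no share changes by more than tol (an assumed cutoff)."""
    for generation in range(max_generations):
        new_shares = next_generation(shares, score)
        if max(abs(new_shares[a] - shares[a]) for a in shares) < tol:
            return new_shares, generation
        shares = new_shares
    return shares, max_generations


# Hypothetical scores for two strategies over a thousand-move game.
score = {
    "A": {"A": 3000, "B": 999},
    "B": {"A": 1004, "B": 1000},
}

stable, generations = run_until_stable({"A": 0.5, "B": 0.5}, score)
print(stable, generations)
```

In this toy population B dwindles toward extinction while A approaches the whole population. The eliminatory procedure described above would then withdraw the extinct strategy's scores and generate a new competition among the survivors.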

Robustness in the ecological scenario is evaluated according to 
four parameters: longevity, average fecundity, stable efficiency, and 
adaptivity. Longevity is the total number of generations survived by 


a strategy's progeny, over all ecosystemic competitions. Average 


1 Axelrod, 1980b, pp.398-401. 
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fecundity is the net increase (or decrease) in a strategy's progeny, 
divided by its longevity. Stable efficiency is a strategy's overall 
efficiency at stability, in all ecosystemic competitions in which it 
figures. Adaptivity is the average fraction of competitors that a 
strategy overtakes (in transition from initial to stable population 
frequency), per ecosystemic competition in which it figures. Based on 
this four-parameter approach, MAC is the most robust strategy in the 
ecological scenario. v 

Chapters Eight and Nine together answer the first experimental 
question: MAC is the most robust strategy in the interactive environ- 
ment, according to the criteria employed. While these criteria are 
neither unique nor absolute, they are combinatorically exhaustive and 
ecologically variegated, and arguably appropriate and fair. 

Part Four, in its first two chapters, seeks to answer the 
second experimental question: why is MAC most robust in the given 
environment? To find an answer, a somewhat detailed examination is 
made of the maximization family members' performances, against other 
strategies and against their own siblings and twins. 

Chapter Ten accounts for MAC's robustness in light of MAC's 
fulfilment of Axelrod's criterion of success: the ability to exploit 
the exploitable strategies without paying too high a price against 
the others.5 Move-by-move analyses of the maximization family's games 
against representative exploitable and non-exploitable strategies 
reveal two important facets of their play. First, the maximization 
family members become slightly more exploitive as their weightings 
become pronouncedly less co-operative. Second, the onset of perpetual 
mutual co-operation with non-exploitable strategies is increasingly 
retarded as the maximization strategies' weightings become less co- 
operative. 

The drastic differences between final scores resulting from 
early perpetual mutual co-operation with non-exploitable strategies, 
and final scores resulting from late or no perpetual mutual co- 
operation with such strategies, far overshadow the small differences 
between final scores against exploitable strategies, which result 
from large graduations in co-operative weighting. In other words, 
while MAC and MAE are only slightly less able to exploit the ex- 
ploitable than are MEU and MAD, MAC and MAE fare far better against 
the non-exploitable than do MEU and MAD. 


5 Ibid, p.403. 

These two performance characteristics are sufficient to account 
for the maximization family's robustness, which increases as a 
function of its members' co-operative weighting. However, it trans- 
pires that the maximization strategies’ average intra-familial scores 
are appreciably lower than their average inter-familial scores. 

Chapter Eleven examines the performances of the maximization 
strategies against their siblings and twins, in order to ascertain 
the causes of their relatively poor intra-familial scores. The 
examination reveals that, with the exception of MAD versus MAD, large 
discrepancies exist between the most probable scores and the average 
scores attained by twins. This is indicative of non-normal distribu- 
tions. The reasons for the non-normalcy naturally lie in the initial, 
probabilistic event matrices, of which the final scores are deter- 
ministic end-products. 

The investigation then turns to the event matrix, whose intri- 
cate properties undergo marked transitions as the distribution of 
outcomes within the matrix changes. Four distinct types of deter- 
ministic interaction are discernible from all possible {W,X,Y,Z} 
outcomes, where W is the number of instances of (C,c); X, of (C,d); 
Y, of (D,c); Z, of (D,d) during the initial one hundred moves. 

First, below a certain threshold value of W, no perpetual 
mutual co-operation can occur between moves 101-1000, regardless of 
the (X,Y,Z) values. MAD's probabilistic fluctuations about its most 
probable event matrix, {1,9,9,81}, lie well below this threshold. In 
consequence, MAD's scores against its twin are normally distributed, 
and so MAD's average score tends toward its most probable score as 
the number of games increases. But the scores in this region are very 
low. 
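
The quoted matrix {1,9,9,81} is what one would expect if, during the initial one hundred moves, each MAD twin co-operated independently with probability 1/10. That probability is an inference from the matrix itself, not a value stated in this summary, and the function below is a hypothetical illustration of the arithmetic.

```python
def expected_event_matrix(p_first, p_second, moves=100):
    """Expected counts (W, X, Y, Z) of (C,c), (C,d), (D,c), (D,d).

    p_first and p_second are the assumed, independent probabilities of
    co-operation of the two players on each of the initial moves.
    """
    w = moves * p_first * p_second
    x = moves * p_first * (1.0 - p_second)
    y = moves * (1.0 - p_first) * p_second
    z = moves * (1.0 - p_first) * (1.0 - p_second)
    # Round away floating-point noise so that the counts print cleanly.
    return tuple(round(value, 6) for value in (w, x, y, z))


print(expected_event_matrix(0.1, 0.1))  # (1.0, 9.0, 9.0, 81.0)
```

With W expected to be near 1, fluctuations about this matrix remain far below any threshold of W at which perpetual mutual co-operation becomes possible.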

Second, above the threshold value of W, perpetual mutual co- 
operation can occur when the sum W + Z is sufficiently large. Empiri- 
cally, it is found that (W,Z) values of (20,15) do not give rise to 
perpetual mutual co-operation, whereas values of (20,20) do so. The 
overall trend in this region is that the onset of perpetual mutual 
co-operation is hastened both as the sum W + Z increases, and as the 
difference X - Y decreases. 

Third, as the sum W + Z becomes too large (more than 60, 
empirically), the matrix becomes increasingly less stable, giving 
rise to oscillating final scores. In this region, perpetual mutual 
co-operation can occur very shortly after the one hundredth move, or 
can occur after a delay of several hundred moves, or cannot occur at 
all, depending upon particular values of (W,X,Y,Z). The oscillations 
exhibit some periodicity. 

Fourth, above upper threshold values of (W,Z) [empirically, 
(83,8)], perpetual mutual co-operation either commences immediately 
on move 101 (if the difference between X and Y is small), or else 
never commences at all. At W values of 84 and higher, immediate 
perpetual mutual co-operation occurs only when X = Y. Otherwise, 
perpetual mutual defection occurs after move 100. 

All MAD-MAD interactions take place in the first region. The 
MEU-MEU pair straddles the first and second regions; the MAE-MAE 
pair, the second and third regions; the MAC-MAC pair, the third and 
fourth regions. The twins' respective distributions of scores are 
explained by the respective spectra of scores accessible to them via 
probabilistic fluctuations of the event matrix in the regions in 
which their interactions take place. The non-normal distributions of 
scores between siblings are similarly explicable. 

This completes the summary of the enquiry's perspective and 
findings. Next, an attempt is made to draw pertinent conclusions, 
with some attention paid to a comparison between the static and 
iterated modes of the Prisoner's Dilemma. The conclusions are grouped 
into five sets, which pertain to the following topics: first, an 
articulation of the relation between probabilism and collective 
rationality; second, a comparison between results of Axelrod's 
tournaments and those of the interactive tournament; third, observa- 
tions about the strategic population of the interactive tournament 
itself; fourth, matters arising from the performance of the maximiza- 
tion family of strategies; and fifth, general remarks concerning the 
game-theoretic approach to conflict research. 

That portion of the enquiry devoted to game-theoretic back- 
ground discusses two principal areas in which the theory's efficacy 
is disputed: utility theory and rationality. The Prisoner's Dilemma 
is an interesting conflict model partly because it links these two 
areas, and in so doing gives rise to numerous complications that 
engender mathematical, philosophical, and social scientific enquiry. 
The implementation of the utile, the hypothetical pure unit of 
utility, allows the model to side-step the difficulties associated 
with intra-personal and inter-personal comparison of utilities. But 
the model does not seek to avoid the complexities associated with the 
probabilistic aspect of utility theory; rather, it creates a direct 
and intimate relation between probabilism and collective rationality. 

A first set of conclusions pertains to the articulation of that 
relation, and to the way in which it is expressed in different modes. 

In the static mode, the dilemma is theoretically resolved by 
eliminating strong and weak dominance as contending principles of 
choice. This invariably leaves the maximization of expected utility 
as the remaining principle of choice; not necessarily by preference, 
but certainly by default. The dilemma is then re-cast in light of the 
probabilistic component of the calculus of expected utility. In order 
to apply the maximization principle, a given prisoner is obliged to 
ask questions such as "What degree of probabilistic dependence, if 
any, obtains between both prisoners’ choices?", or "What is the 
probability that the other prisoner is collectively rational?", or 
"With what probability will the other prisoner correctly predict my 
choice?". The dilemma persists because these questions have no 
definitive answer. 

While attempted resolutions of the dilemma eliminate uncondi- 
tional defection as a collectively rational choice, they do not 
prescribe unconditional co-operation to the collectively rational 
prisoner. The maximization of expected utility allows the possibility 
of defection, in order that the collectively rational prisoner 
protect himself, as best he can, against an individually rational or 
irrational fellow-prisoner. The given prisoner's assessment of the 
other's rationality is reflected in the given prisoner's assignment 
of probability values. Thus probabilism and collective rationality 
are inextricably linked. But in the static mode, the nature of the 
dilemma imposes an a priori probabilistic calculus, with all its 
attendant difficulties, upon the collectively rational prisoner. 

In the iterated mode, a principle of choice becomes a strategy, 
which generates a sequence of choices. The iterated dominance prin- 
ciple is the strategy of pure defection. The soundness of not adopt- 
ing pure defection, and of maximizing expected utility, is reflected 
in the results of the interactive tournament. In the iterated mode, 
it becomes possible to make use of an a posteriori, or frequentist 
interpretation of probabilism. This permits an objective assessment 
of the probability component in the calculation of expected utili- 
ties, which in turn reveals the capacity of the co-operatively 
weighted maximization strategy to perform effectively against a 
variety of other strategies. 

But a novel weakness becomes apparent in the iterated mode. The 
degree of co-operative weighting necessary to extract optimum inter- 
familial performance from the maximization algorithm ironically 
condemns the maximization family members to relatively poor intra- 
familial performances. The maximization strategies, as formulated in 
this study, are unable to achieve consistent mutual co-operation with 
one another primarily because they do not recognize one another in 
competition. 

Thus one perceives a dual continuity between the static and 
iterated modes. In both modes, maximizing expected utility is favour- 
ed as a decision rule. This is an encouraging continuity. But in both 
modes, the rule admits of weaknesses. These weaknesses are probabi- 
listic complements. In the static mode, the maximization principle 
cannot unerringly identify its twin, because of the subjective nature 
of a priori probabilism. In the iterated mode, the maximization rule 
cannot consistently recognize its twin, in spite of the objective 
nature of a posteriori probabilism. When it confronts itself in 
either mode, the maximization rule is hoist with its own probabilis- 
tic petard. This is a discouraging continuity. 

A second set of conclusions pertains to a comparison of salient 
results of Axelrod's tournaments with those of the interactive 
tournament. 

Overall, the results of the interactive tournament corroborate 
Axelrod's most general conclusions from both of his tournaments. 
Respectively, these are: 


"The effectiveness of a particular strategy depends not 
only on its own characteristics, but also on the nature 
of the other strategies with which it must interact",6 


"There is no best rule [strategy] independent of the 
environment."7 


Although MAC proved to be the most robust strategy in the interactive 
environment (at least according to the criteria of robustness adopted 
herein), MAC is by no means the "best" strategy for all iterated 
Prisoner's Dilemmas. 

Without much difficulty, one can create environments in which 
MAC is not the most successful strategy. As seen in Chapter Eleven, 
one such environment consists of the maximization family members 
themselves: MAC, MAE, MEU and MAD. In the environment of these family 
members, MAC ranks third among four in points scored. And although 
the criteria of combinatoric and ecological robustness have not been 
applied to this group in isolation, one can, with an eye on their raw 
scores, speculate that MAC would not prove most robust in these 
familial confines. 

The interactive tournament corroborates most of Axelrod's 
findings with respect to particular properties of successful strate- 
gies. TFT's combination of niceness, provocability and forgiveness 
stood it in most robust stead in Axelrod's two tournaments.8 Similar- 
ly, MAC is both provocable and forgiving (except against its twin, 
when the instability of the event matrix can pre-empt its capacity to 
forgive). But MAC is neither nice nor rude; rather, nide. (Recall, a 
nice strategy is never the first to defect; a rude strategy, always 
the first to defect; a nide strategy, indeterminate with respect to 
primacy of defection.) The property of nideness is not absolutely 
preferable to that of niceness; it merely supersedes niceness in 
certain environments. 


6 Idem., 1980a, p.21. 


7 Idem., 1980b, p.402. 

Of the five most robust strategies overall in the interactive 
tournament, MAC and MAE are nide, while SHU, ETH and FRI are nice. 
All these strategies are provocable. But SHU becomes incrementally 
less forgiving following each provocation, and FRI is not forgiving 
at all. That the quality of mercy can be strained and lacking in two 
fairly successful strategies is perhaps indicative of the harshness 
of the interactive environment. 

The results of the interactive tournament also corroborate two 
of Axelrod's corollary conclusions. 

As previously mentioned, after his first tournament Axelrod 
concludes that if the strategy submitted by Downing had been given a 
higher initial co-operative weighting, then that strategy would have 
won the tournament, "and won by a large margin."9 Downing's strategy 
was none other than the maximization of expected utility, with the 
same weighting as MEU. The increasingly co-operative weightings of 
MAE and MAC, and their respective performances in the interactive 
tournament, support Axelrod's finding. 

And as previously discussed, Axelrod's criterion of hypotheti- 
cal success in his second tournament is the ability to exploit the 
exploitable strategies without paying too high a price against the 
non-exploitable strategies.10 This criterion is not satisfied by any 
of the sixty-three competitors in that tournament, but it is satis- 
fied by MAC (and, to a lesser extent, by MAE). Axelrod's criterion 
thus applies fully to the interactive tournament. This applicability 
does not confer the authority to declare that MAC would have won 
Axelrod's second tournament, but it would be interesting to re- 
conduct that tournament with MAC and MAE in the competing popula- 
tion.11 This enquiry predicts that MAC would win that tournament as 
well. 


9 Idem., 1980a, p.20. 
10 Idem., 1980b, p.403. 


11 MAC's and MAE's algorithms would have to be modified to 
accommodate Axelrod's tournament games, which consist of fewer moves. 

A third set of conclusions pertains to certain findings within 
the interactive environment itself. Since the interactive population 
is a regulated population, one might ask what conclusions can be 
drawn from the particular ways in which this experiment is regulated. 
Let the competing strategies be considered relative to their respec- 
tive family members, and relative to their respective families’ 
performances. 

The probabilistic family fares least well overall. It is clear 
that neither the pure strategies (DDD and CCC) nor the mixed random 
strategies (TQD, RAN and TQC) are particularly viable in a population 
containing more sophisticated decision rules. CCC is the least robust 
member of this family, and the least robust strategy in the tourna- 
ment, because it is utterly non-provocable and therefore thoroughly 
exploitable. The most successful members of this family, DDD and TQD, 
are also the least co-operative. DDD is radically exploitive, but is 
also radically provocative to strategies that can retaliate. TQD is 
highly exploitive, and is just co-operative enough to gull the less- 
provocable segment of the population. But DDD and TQD rank only 
thirteenth in overall robustness. 

The tit-for-tat family spans an interesting range of perfor- 
mance. The robustness of its three most successful members, SHU, TFT 
and TTT, increases as their provocability increases and as their 
forgiveness decreases. It is noteworthy that their order of finish in 
this tournament is precisely the reverse of that in Axelrod's first 
tournament, which TTT would have won (had it been submitted) and in 
which TFT out-ranks SHU. Once again, this reversal seems to indicate 
that the environment of Axelrod's first tournament is less harsh than 
that of the interactive tournament. In a friendlier environment, one 
would naturally expect decreasing provocability and increasing 
forgiveness to conduce to increasing success. 

The other two members of this family, TAT and BBE, do not fare 
well in the interactive environment. TAT's contrariness (recall, TAT 
is the binary complement of TFT) allows it to exploit the least 
provocable strategies. For example, TAT defeats CCC by 5000 to 0, and 
NYD by 4984 to 14. But the same contrariness results in TAT itself 
being heavily exploited by other exploitive or unforgiving strate- 
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gies. For example, DDD defeats TAT by 4996 to 1, while FRI defeats 
TAT by 4991 to 6. Thus TAT's contrariness can lead to extremes, but 
does not conduce to overall success. 

And note the difference between TFT's and BBE's performances.
TFT is fifth most robust overall; BBE, fifteenth. Yet BBE is
identical to TFT, save that it defects randomly with probability 1/10
following an opponent's co-operation. BBE's poor showing indicates
that its unprovoked defections are not tolerated by the many
provocable strategies in the environment. It seems possible to observe,
without trite moralization, that BBE is a strategy which cheats but
does not prosper.

The optimization family members, whose program structures are 
not closely related, also span a range of performance. Respectively, 
ETH and CHA are fourth and fifth most robust overall. ETH's decision 
rule can be regarded as a partial but incomplete exercise in
statistical optimization. While ETH defects, on a given move, with a
probability equal to the current frequency of an opponent's defection, ETH
disregards joint outcomes and payoff structures alike. CHA is a 
relatively complex strategy, which does not defect lightly (recall, 
three conditions must be simultaneously fulfilled to provoke a 
defection from CHA). 

Observe that CHA's complexity does not always guarantee
appreciably more effectiveness than ETH's simplicity. In Axelrod's
second tournament, CHA ranks second, offensively, among sixty-three 
strategies; ETH, fourteenth. In the main interactive tournament, 
their respective offensive ranks are fifth and sixth. And ETH, 
according to the criteria of this study, is actually more robust than 
CHA. But on the whole, both rules are fairly successful. 

GRO, its great longevity notwithstanding, is not provocable 
enough to be successful in the interactive environment. And NYD, 
owing partly to its magnanimity (NYD always co-operates following 
three mutual defections) is thoroughly exploited in this environment. 

The hybrid family’s performance shows that strategies with 
alternative decision paths can be relatively effective. FRI, a hybrid 
of the two pure strategies (CCC and DDD), is seventh most robust in 
the interactive tournament. FRI's performance is thus incomparably
more successful than that of CCC or DDD. It seems reasonable to
observe that, as a strategic whole, FRI is certainly greater than the
sum of its parts. Interestingly, although TES is compounded from TFT
and an exploitive strategy, TES does not fare as well as FRI, perhaps 
because both of its decision paths entail compromises. FRI either co- 
operates consistently or defects permanently, whereas TES pursues a 
deterministic but not pre-determined course along either path. But in 
general, a potential seems to exist for effective hybrids. 

The maximization family's performance has been evaluated in 
some detail. One observes that, against a range of other strategies, 
a maximization strategy's robustness increases strictly with its 
initial co-operative weighting. Against a range of siblings and 
twins, however, a co-operative weighting that is too high begins to 
lose effectiveness. Thus, just as there is no best strategy indepen- 
dent of environment, there seems to be no best co-operative weighting 
independent of maximization family representation in the environment. 

In sum, four of the five competing families place members among 
the seven most robust strategies in the interactive environment. MAC 
and MAE, from the maximization family, rank first and third; SHU and
TFT, from the tit-for-tat family, second and fifth; ETH and CHA, from 
the optimization family, fourth and fifth; FRI, from the hybrid 
family, seventh. The only family not to be represented among the most 
robust strategies is the probabilistic family. One can conclude that 
many types of strategic program structure and/or conceptual function 
are capable of respectable performance in the interactive environ- 
ment, with the exception of the pure and the purely random strate- 
gies. 

A fourth set of conclusions pertains to the performance of the 
maximization family. Given Axelrod's findings and conclusions, it is 
not surprising that MAC (and MAE) fare so well, that MEU fares 
indifferently, and that MAD fares poorly. However, prior to the 
analysis of the inner workings of the event matrix, the results of 
inter-familial competition were quite surprising, with respect both
to the scores and to their distributions. 

It is possible to criticize the maximization family's
performance on several grounds.

To begin with, it is clear that the game must be of sufficient
length to allow the a posteriori probabilistic calculus to become 
maximally effective. The onset of perpetual mutual co-operation 
sometimes requires several hundred moves, whether in intra-familial 
or inter-familial competition. It must be admitted that MAC and MAE, 
as formulated herein, would be increasingly disadvantaged in games of 
correspondingly fewer moves. By the same token, of course, their 
performances ought to become more effective in games of correspond- 
ingly greater length. 

It might be instructive to conduct a further series of experi- 
ments, in which the maximization strategies would be obliged to 
calculate expected utilities after varying numbers of weighted
probabilistic encounters. A change in the initial number of such 
encounters results in a change in the quantity (and therefore in the 
quality) of statistical information upon which these strategies begin 
to act. Ultimately, one could learn how the properties of the event 
matrix change with respect both to alterations of the initial sum 
W+X+Y+Z, and to alterations of the length of the game itself.

If a degree of anthropomorphic latitude were permitted, one 
could make different kinds of judgements about MAC. As formulated in 
the interactive tournament, MAC virtually sacrifices the first ten 
percent of its moves, prior to taking any decisions, in order to 
obtain some idea of its opponents’ play. One might be inclined to 
admire MAC's "confidence" or "courage". One might equally well be 
inclined to reprobate MAC's "boldness" or "bravado". 

However, it can be objected that anthropomorphisms are
inappropriate in formal game-theoretic contexts. As earlier observed,
Rapoport excludes the psychological orientations of the players from
such contexts.12 Rapoport refrains from imputing psychological
motives to a player's choice of strategy, in order to consider the
question of strategic rationality strictly from the viewpoint of game
theory. If one follows Rapoport's lead, then one would refrain from
imputing human qualities to the strategies themselves. Whether such
qualities be admirable or distasteful in a social context is arguably
irrelevant to a formal game-theoretic discussion of strategic
effectiveness.


12 Rapoport, 1966, p.103.

However, a more serious objection to the maximization family's 
performance can be made on game-theoretic grounds. In this enquiry's 
treatment of the Prisoner's Dilemma in the static mode, the
maximization of expected utility is defined as a collectively rational
strategy. Recall that an important attribute of a collectively rational
strategy is its ability to contribute to the attainment of a Pareto- 
optimal outcome. In the Prisoner's Dilemma, Pareto-optimality is 
congruent with mutual co-operation. And in the static mode, the 
maximization of expected utility indeed attempts to realize this 
joint outcome (at least in theory), by prescribing that a strategist 
co-operate with the probability that the other strategist is collec- 
tively rational. Notwithstanding the difficulties associated with 
assigning a value to the foregoing probability statement, the strate- 
gic intent is clearly collectively rational. 

The objection arises in the iterated mode, in which the maxi- 
mization family exploits nice strategies that are not provocable 
(such as CCC). Is exploitiveness an attribute of collective rational- 
ity? At first blush, it appears not to be so. One reflexively as- 
sociates exploitation with individual rationality, since the ex- 
ploiter's gain is the exploitee's loss. When a non-exploitive stra- 
tegy such as TFT competes with CCC, the pair attains immediate and 
perpetual mutual co-operation, and hence realizes an iterated Pareto- 
optimal outcome. Is this not collectively rational? When a maximiza- 
tion strategy such as MAC competes with CCC, the outcomes from move 
101 onward are (D,c), with associated payoffs favouring the exploiter 
alone. Is this not individually rational? 

Thus, the objection can be raised that even the most co-opera- 
tively weighted members of the maximization family are "wolves in 
sheep's clothing". They appear to subscribe to collective rationality 
in the static mode, yet they exploit non-provocable strategies in the 
iterated mode. This objection, of course, is not made on moral 
grounds. However opposed one may be to the exploitation of the weak 
by the strong, for example in economic or military contexts, one 
neither approves nor disapproves of exploitation in a Prisoner's
Dilemma computer tournament. One merely studies it as a phenomenon.
No moral judgement is rendered about simulated game-theoretic ex- 
ploitation; the issue is whether an exploitive strategy can claim to 
be collectively rational. 

The objection can be met with a logical argument. Simply 
stated, it amounts to this: the co-operatively weighted maximization 
strategies can indeed claim to be collectively rational, because 
their performances are consistent with the concept of collective 
rationality supported in this enquiry. By the working definition 
employed herein, a strategy is collectively rational if it maximizes 
its expected utility, and co-operates with the probability that its 
opponent is collectively rational. CCC, albeit a nice strategy, is 
not a collectively rational strategy (since it lacks the capacity to 
defend itself by retaliatory defections).13 Thus MAC's exploitation
of CCC is not inconsistent with MAC's claim to collective rational- 
ity. 

The same objection can be countered if levelled against MAC's 
exploitation of its twin. If MAC is collectively rational, then why 
does it often fail to recognize its twin? It fails to do so because 
of its ironic blind-spot. That there is no best strategy independent 
of environment means that every strategy admits of some weakness. In 
MAC's case, its strength against others causes its weakness against 
itself. Since the MAC-MAC pair begins its game by playing one hundred 
weighted random moves, certain initial distributions of outcomes 
cause the twins to react as if they were competing against members of 
the probabilistic family. Thus MAC is susceptible to unfortunate but 
understandable occurrences of mistaken identity. MAC is collectively 
rational, but prone to err in assessing the probability of its twin's 
collective rationality. 

A strategy that is consistently nice (like CCC) may arouse more
sympathy than a strategy that is consistently rude (like DDD);
nonetheless, MAC's perpetual defection against both indicates that,
in MAC's estimation, neither pure strategy is collectively rational.
At the same time, MAC's perpetual co-operation with TFT indicates
that MAC deems TFT to be a collectively rational strategy.


13 Recall that a collectively rational player, while desirous of
mutual co-operation, must be able to protect himself against
individually rational and irrational players.

And this leads to a vital conclusion. TFT is indeed a collec- 
tively rational strategy, although not an exploitive one. Collective 
rationality is not uniquely defined. In consequence, one can conclude 
that there is no "most collectively rational" strategy independent of
environment. 

If an environment is universally "friendly", i.e. if it is 
populated exclusively by nice strategies, then iterated competition 
ceases to exist. Immediate perpetual mutual co-operation is attained 
by all pairs, and all scores are tied. All strategies win; none lose. 

But let one rude strategy be introduced into the environment, 
and equality of outcome vanishes. Nice strategies that are non- 
provocable are exploited by the rude one, while provocable strategies 
retaliate against it. Winners and losers emerge. 

Now let a nide strategy be introduced. If the nide strategy co- 
operates with the provocable, retaliates against the rude, and 
exploits the non-provocable, then the nide strategy wins. 

If an environment is overwhelmingly "hostile", i.e. if it is 
populated solely by rude and unforgiving strategies, then iterated 
competition also tends to cease. All pairs lock into perpetual mutual 
defection. All strategies lose; none wins. 

A collectively rational strategy exhibits different performance 
characteristics in differently-constituted environments. In a friend- 
ly environment, collective rationality should not manifest exploi- 
tiveness. But if the environment contains individually rational 
and/or irrational strategies, then a collectively rational strategy 
must be exploitive to be successful. 

One final property of the maximization family bears mention 
anew. It is a remarkable dual-aspect property, unique to this family, 
which provides partial compensation for the maximization strategies’ 
intermittent lack of recognition of their siblings and twins. Con- 
sider consecutive event matrices of a game between any maximization 
family member (MAX) and any opponent (OPP), after any number of moves
have been made, wherein the latest outcome is a mutual defection:

Game 12.1 - MAX versus OPP, Consecutive Event Matrices


W+X+Y+Z moves:       OPP          W+X+Y+Z+1 moves:     OPP
                    c    d                            c    d
             C      W    X                     C      W    X
       MAX                               MAX
             D      Y    Z                     D      Y    Z+1

EUC = (WR + XS)/(W+X)             EUC = (WR + XS)/(W+X)
EUD = (YT + ZP)/(Y+Z)             EUD = [YT + (Z+1)P]/(Y+Z+1)


(In game 12.1, it is understood that T > R > P > S and
R > 1/2(S+T) and W,X,Y,Z > 0.)

It can readily be shown, algebraically, that


[YT + (Z+1)P]/(Y+Z+1) < (YT + ZP)/(Y+Z)


The expression for EUD is a decreasing monotonic function of Z. It is 
bounded below by the value of P, which it approaches in the limit as 
Z becomes very large. In other words, in any iterated Prisoner's 
Dilemma, the expected utility of defection is actually decreased by 
mutual defection. This decrease, in turn, increases the maximization 
strategy's propensity to co-operate. Thus mutual defection contri- 
butes to co-operativeness. 
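The algebra behind this claim is short. The following derivation is offered as a sketch; it uses only the stated conditions T > P and Y, Z > 0, and the shorthand EUD(Z), for the expected utility of defection regarded as a function of Z alone, is introduced here for convenience.

```latex
\begin{align*}
\mathrm{EUD}(Z) - \mathrm{EUD}(Z+1)
  &= \frac{YT + ZP}{Y+Z} - \frac{YT + (Z+1)P}{Y+Z+1} \\
  &= \frac{(YT+ZP)(Y+Z+1) - (YT+ZP+P)(Y+Z)}{(Y+Z)(Y+Z+1)} \\
  &= \frac{(YT+ZP) - P(Y+Z)}{(Y+Z)(Y+Z+1)} \\
  &= \frac{Y(T-P)}{(Y+Z)(Y+Z+1)} \;>\; 0, \\[4pt]
\mathrm{EUD}(Z) - P
  &= \frac{Y(T-P)}{Y+Z} \;>\; 0,
  \qquad \mathrm{EUD}(Z) - P \to 0 \ \text{as } Z \to \infty .
\end{align*}
```

The first chain shows that each additional mutual defection strictly lowers EUD; the last line shows that EUD stays above P and approaches it as Z grows.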

Now consider consecutive event matrices of a game wherein the
latest outcome is mutual co-operation. (In game 12.2, as in game 12.1,
it is understood that T > R > P > S and R > 1/2(S+T) and
W,X,Y,Z > 0.)

Game 12.2 - MAX versus OPP, Consecutive Event Matrices


W+X+Y+Z moves:       OPP          W+X+Y+Z+1 moves:     OPP
                    c    d                            c    d
             C      W    X                     C      W+1  X
       MAX                               MAX
             D      Y    Z                     D      Y    Z

EUC = (WR + XS)/(W+X)             EUC = [(W+1)R + XS]/(W+1+X)
EUD = (YT + ZP)/(Y+Z)             EUD = (YT + ZP)/(Y+Z)


Similarly, it can be shown that


[(W+1)R + XS]/(W+1+X) > (WR + XS)/(W+X)

The expression for EUC is an increasing monotonic function of W. It 
is bounded above by the value of R, which it approaches in the limit
as W becomes very large. In other words, in any iterated Prisoner's 
Dilemma, the expected utility of co-operation is increased by mutual 
co-operation. In consequence, once a maximization strategy participa- 
tes in mutual co-operation by virtue of expected utility, it becomes 
nice instead of nide. It will never be the first to defect following 
such an outcome. 

In sum, maximization strategies possess a property that in- 
creasingly favours mutual co-operation as the length of the game 
increases, by dint of either mutually defective or mutually co- 
operative outcomes. 
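The dual-aspect property can also be checked numerically. The sketch below is illustrative only: the payoff values T=5, R=3, P=1, S=0 are assumed (any values with T > R > P > S would serve), and the function names are hypothetical. Exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction
from itertools import product

# Assumed payoffs; any T > R > P > S would do for this check.
T, R, P, S = 5, 3, 1, 0


def euc(w, x):
    """Expected utility of co-operation from the event-matrix counts W and X."""
    return Fraction(w * R + x * S, w + x)


def eud(y, z):
    """Expected utility of defection from the event-matrix counts Y and Z."""
    return Fraction(y * T + z * P, y + z)


def check_dual_property(max_count=30):
    """Return True if both claims hold for all positive counts up to max_count."""
    for a, b in product(range(1, max_count + 1), repeat=2):
        # A mutual defection (Z -> Z+1) lowers EUD, which remains above P.
        if not (P < eud(a, b + 1) < eud(a, b)):
            return False
        # A mutual co-operation (W -> W+1) raises EUC, which remains below R.
        if not (euc(a, b) < euc(a + 1, b) < R):
            return False
    return True


print(check_dual_property())
```

A finite grid is of course no proof; it merely confirms, for small counts, what the two inequalities establish in general.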

A fifth and final set of conclusions pertains to the game- 
theoretic approach to conflict research. One can place this enquiry 
in perspective by envisaging two very large sets, side by side. One 
set is that of actual human conflicts; the other, that of conflict
models.

Human conflict, interpreted in its broadest sense, encompasses
a host of phenomena that range from intra-personal ethical quandaries 
to international warfare. Any number of individuals and/or groups can 
be involved in situations of conflict, whether of principle, of 
interest, of ideology, of nationality, and so forth. Conflicts can be 
expressed verbally, violently, symbolically, structurally, and in 
numerous other ways. 

Models are as diverse as the conflicts they represent. And 
models, like conflicts themselves, are susceptible to change.
Game-theoretic models (as well as other types of conflict models) exhibit
variation and alteration, if not amelioration, through research and 
development. An effective conflict model sheds coherent light on some 
aspect or aspects of a given conflict, and may thus contribute to the 
formulation of a resolution. But actual conflicts are resolved by 
people, not models. The existence of a resolution to a given conflict 
does not guarantee its implementation. Nonetheless, the existence of 
a resolution, or the belief that a resolution exists, can profoundly 
influence the will to resolve a conflict. 

A sub-set of the set of conflict models is game-theoretic in 
character. A sub-set of this sub-set deals with Prisoner's Dilemmas. 
The sub-sub-set of Prisoner's Dilemmas can be further partitioned 
according to the number of prisoners involved. This study treats that 
partition which contains two-person Prisoner's Dilemmas, in both 
static and iterated modes. Although twenty strategies are involved in 
the interactive tournament, the competitions take place pair-wise; 
hence the tournament is a multiple, iterated two-person Prisoner's 
Dilemma. 

Prisoner's Dilemmas are rife in the set of actual human
conflicts. Hobbesian wars of omnia contra omnes,14 the nuclear arms
race,15 the manufacture and sale of conventional arms, among many
other conflicts, can be modelled as Prisoner's Dilemmas that involve
varying numbers of people, groups and nation-states.16


14 E.g. see T. Hobbes (1651), Leviathan, (ed. M. Oakeshott),
Basil Blackwell, Oxford, 1957, Part I, Chapter XIII: "Hereby it is
manifest, that during the time men live without a common power to
keep them all in awe, they are in that condition which is called war;
and such a war, as is of every man, against every man." Among other
modern scholars, Rawls identifies this Hobbesian "state of nature" as
an N-person Prisoner's Dilemma. See J. Rawls, A Theory of Justice,
Clarendon Press, Oxford, 1972, p.269.

In perspective, then, this enquiry belongs to a small sub-sub- 
sub-set of a large universe of conflict models, which corresponds 
with some sub-set of a large universe of actual human conflicts. This 
enquiry is concerned with strategic interaction in the Prisoner's 
Dilemma as a game-theoretic dimension of conflict research; it does 
not attempt to relate its findings to human conflicts per se. The 
existence of such relations is naturally implied, since a dimension 
of strategic interaction is superimposed upon many—if not most— 
human conflicts. 

However, an articulation of such relations is a task that lies 
far beyond the scope of the study at hand. Any such articulation must 
eventually re-introduce vexed questions associated with intra-per- 
sonal and inter-personal comparisons of utilities, and cannot avoid 
confronting further paradoxes of human rationality. 

Rapoport makes a cogent observation about the potential
applicability, and inapplicability, of game-theoretic conflict models:


"At present game theory has, in my opinion, two important
uses, neither of them related to games nor to conflict
directly. First, game theory stimulates us to think about
conflict in a novel way. Second, game theory leads us to
some genuine impasses, that is, to situations where its
axiomatic base is shown to be insufficient for dealing
even theoretically with certain types of conflict
situations. These impasses set up tensions in the minds
of people who care. They must therefore look around for
other frameworks into which conflict situations can be
cast. Thus, the impact is made on our thinking processes
themselves, rather than on the actual content of our
knowledge."18


15 The nuclear arms race between the U.S.A. and the U.S.S.R. is
a classic two-nation Prisoner's Dilemma. E.g. see J. Wiesner & H.
York, 'National Security and the Nuclear Test-Ban', Scientific
American, 211, October 1964, pp.27-35. For a game-theoretic treatment
of the allied doctrine of "brinkmanship", see e.g. M. Deutsch & R.
Lewicki, '"Locking-in" Effects During a Game of Chicken', Journal of
Conflict Resolution, 14, 1970, pp.367-379.


16 The creation, maintenance and exploitation of global
conventional arms markets, by individuals and governments, can be viewed as
a "tragedy of the commons". The generalized socio-economic model is
developed and presented by G. Hardin, 'The Tragedy of the Commons',
in A. Baer (ed.), Heredity and Society, The Macmillan Company, N.Y.,
1973, pp.226-239. It can be observed that the tragedy of the commons
is itself an N-person Prisoner's Dilemma.


17 Aron, for example, recognizes the general usefulness of game
theory to political science, in that the theory permits "abstract
formulation of the dialectic of antagonism". See R. Aron, Peace and
War, Weidenfeld and Nicolson, London, 1966, p.772.


Rapoport's observation is borne out in ensuing literature on conflict
research. One can perceive an impact of game-theoretic conflict 
models generally, and of Prisoner's Dilemmas particularly, on contem- 
porary philosophical conceptions of rationality. 

For example, contemplation of the Prisoner's Dilemma leads the 
philosopher Davis to conclude that 


". . . Co-operation between individuals with clashing
interests may be more rationally defensible than has been
widely thought."19


This enquiry is intended as a modest contribution to the 
understanding of the Prisoner's Dilemma conflict model. It reveals, 
perhaps above all, that much work remains to be done to further 
develop this understanding. The emergence of a clearer and more 
detailed picture of strategic interaction could lend some impetus, in 
turn, to the demanding task of implementing effective strategies in 
flesh-and-blood Prisoner's Dilemmas, in order that actual conflicts 
be resolved and future ones circumvented. 

Although this enquiry in no way purports to treat such momen- 
tous human problems, it strives, at least, to ponder an associated 
game-theoretic conflict model in a rigorous fashion. Let it conclude 
with the oft-cited, perhaps prescient words of Braithwaite:


"And if anyone is inclined to doubt whether any serious
enlightenment can come from the discreetly shaded candles
of the card-room, I would remind him that, three hundred
years ago this year (1954), that most serious of men,
Blaise Pascal, laid the foundations of the mathematical
theory of probability in a correspondence with Fermat
about a question asked him by the Chevalier de Méré, who
had found that he was losing at a game of dice more often
than he had expected. No one today will doubt the
intensity, though he may dislike the colour, of the (shall I
say) sodium light cast by statistical mathematics, direct
descendant of theory of games of chance, upon the social
sciences. Perhaps in another three hundred years' time
economic and political and other branches of moral
philosophy will bask in radiation from a source--theory
of games of strategy--whose prototype was kindled round
the poker tables of Princeton."20


18 Rapoport, 1960, p.242. See also K. Boulding, Conflict and
Defense, Harper & Row, N.Y., 1963, pp.56-57.


19 L. Davis, 'Prisoners, Paradox, and Rationality', American
Philosophical Quarterly, 14, 1977, pp.319-327.


20 R. Braithwaite, Theory of Games as a Tool for the Moral
Philosopher, Cambridge at the University Press, 1955, pp.54-55. For an
elucidation of the Chevalier de Méré's problem, and its solution, see
e.g. Rapoport, 1960, pp.113-115.


Appendix One - Glossary


The Probabilistic Family 


DDD: The strategy of pure defection. Defects unconditionally 
(equivalently, with probability equal to unity). 

TQD: The strategy of three-quarter random defection. Co-oper- 
ates and defects randomly, with respective probabilities 1/4 and 3/4. 

RAN: The equiprobable random strategy. Co-operates and defects 
randomly with probability 1/2. 

TQC: The strategy of three-quarter random co-operation.
Co-operates and defects randomly, with respective probabilities 3/4 and
1/4.

CCC: The strategy of pure co-operation. Co-operates uncondi- 
tionally (equivalently, with probability equal to unity). 


The Tit-for-Tat Family 


TFT: The strategy of tit-for-tat, the familial prototype. It 
co-operates on the first move, and plays next whatever its opponent 
played previously. 

TTT: The strategy of tit-for-two-tats. It co-operates on the
first two moves, and defects only after two consecutive defections by 
its opponent. 

BBE: The strategy that "burns both ends". It plays exactly as
TFT, but also defects randomly with a probability of 1/10 following
each mutual co-operation. 

SHU: Shubik's strategy. It plays exactly as TFT, but increments
its retaliatory defections. It defects once following its opponent's
first departure from mutual co-operation, defects twice consecutively
following its opponent's second departure, and defects n times
consecutively following its opponent's nth departure.

TAT: The strategy of tat-for-tit. It defects on the first move, 
and plays next the opposite of whatever its opponent played previous- 
ly. 
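The deterministic members of this family are simple enough to sketch in a few lines. The code below is an illustration, not the tournament program itself: the payoffs (T=5, R=3, P=1, S=0) and the game length of 1000 moves are assumptions inferred from the raw scores in Appendix Two, and all function names are hypothetical. Each strategy receives its own history and its opponent's history and returns "C" or "D".

```python
# Assumed payoffs and game length, consistent with the raw scores in Appendix Two.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
MOVES = 1000


def ddd(own, other):
    """Pure defection."""
    return "D"


def ccc(own, other):
    """Pure co-operation."""
    return "C"


def tft(own, other):
    """Tit-for-tat: co-operate first, then echo the opponent's previous move."""
    return "C" if not other else other[-1]


def ttt(own, other):
    """Tit-for-two-tats: defect only after two consecutive opposing defections."""
    return "D" if other[-2:] == ["D", "D"] else "C"


def tat(own, other):
    """Tat-for-tit: defect first, then play the opposite of the opponent's previous move."""
    if not other:
        return "D"
    return "C" if other[-1] == "D" else "D"


def play(a, b, moves=MOVES):
    """Play strategy a against strategy b and return their total scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(moves):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(tat, ccc))   # TAT exploits CCC on every move
print(play(ddd, tat))   # DDD exploits TAT after the first move
```

Under these assumptions the two printed pairings give 5000 to 0 and 4996 to 1, the scores quoted in the discussion of TAT. SHU and BBE are omitted because they need, respectively, a retaliation counter and a random number source.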


The Maximization Family


MEU: The strategy of maximizing expected utilities. It plays 
randomly during the first 100 moves, with equal random probability of 
co-operating or defecting on each move. It then calculates its 
expected utilities of co-operation and defection, using the frequen- 
cies of past outcomes as the (a posteriori) probabilistic component. 
It then plays according to the greater expected utility, and updates 
the appropriate outcome frequency after each move. 

MAD: The strategy of maximizing expected utilities, weighted at 
defection. It plays exactly as MEU, but its initial probabilistic 
weighting is 1/10 random co-operation and 9/10 random defection. 

MAE: The strategy of maximizing expected utilities, weighted at 
equiprobable expectation. It plays exactly as MEU, but its initial 
probabilistic weighting is 5/7 random co-operation and 2/7 random 
defection. 

MAC: The strategy of maximizing expected utilities, weighted at 
co-operation. It plays exactly as MEU, but its initial probabilistic 
weighting is 9/10 random co-operation and 1/10 random defection. 
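The decision step shared by these four strategies can be sketched as follows. This is a minimal illustration under stated assumptions, not the tournament program: the payoffs T=5, R=3, P=1, S=0 are assumed, the tie-break in favour of co-operation is an assumption, and the sketch simply refuses to decide when one of the strategy's own moves has never been observed. The names `expected_utilities`, `meu_move` and `record` are hypothetical.

```python
from fractions import Fraction

# Assumed payoffs (T > R > P > S).
T, R, P, S = 5, 3, 1, 0


def expected_utilities(counts):
    """Return (EUC, EUD) from an event matrix keyed by (own move, opponent's move)."""
    w, x = counts[("C", "C")], counts[("C", "D")]
    y, z = counts[("D", "C")], counts[("D", "D")]
    if w + x == 0 or y + z == 0:
        # Assumption: the rule is undefined until both own moves have been tried.
        raise ValueError("both co-operation and defection must have been played")
    euc = Fraction(w * R + x * S, w + x)
    eud = Fraction(y * T + z * P, y + z)
    return euc, eud


def meu_move(counts):
    """Play according to the greater expected utility (ties co-operate, by assumption)."""
    euc, eud = expected_utilities(counts)
    return "C" if euc >= eud else "D"


def record(counts, own, other):
    """Update the appropriate outcome frequency after a move."""
    counts[(own, other)] += 1


# Hypothetical state after 100 weighted random moves against CCC.
counts = {("C", "C"): 88, ("C", "D"): 0, ("D", "C"): 12, ("D", "D"): 0}
for _ in range(900):
    record(counts, meu_move(counts), "C")   # CCC always co-operates
print(counts)
```

Against an unconditional co-operator, EUC equals R and EUD equals T whatever the initial split, so the rule defects on every remaining move; only the (D,C) count grows in the loop above.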


The Optimization Family


NYD: Nydegger's strategy. It plays tit-for-tat for the first 
three moves, save that if it was the only one to co-operate on the 
first move and the only one to defect on the second move, it defects 
on the third move. After that, its choice is determined from the 3
preceding outcomes in the following manner. Let A be the sum formed
by counting the other's defection as 2 points and one's own as 1
point, and giving weights of 16, 4 and 1 to the preceding three
moves in chronological order. The choice can be described as
defecting only when A equals 1, 6, 7, 17, 22, 23, 26, 29, 30, 31, 33, 38,
39, 45, 49, 54, 55, 58, or 61.

GRO: Grofman's strategy. It co-operates on the first move.
After that, it co-operates with probability 2/7 following a dissimilar
joint outcome, and always co-operates following a similar joint
outcome.

CHA: Champion's strategy. It co-operates on the first ten
moves, and plays tit-for-tat on the next fifteen moves. From move
twenty-six onward, it co-operates unless all of the following
conditions are true: the opponent defected on the previous move, the
opponent's frequency of co-operation is less than 60%, and a random
number between zero and one is greater than the opponent's frequency
of co-operation.

ETH: Eatherly's strategy. It co-operates on the first move, and 
keeps a record of its opponent's moves. If its opponent defects, it 
then defects with a probability equal to the relative frequency of 
the opponent's defections. 
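The CHA and ETH rules translate fairly directly into code. The sketch below is one reading of the two glossary entries, not the original programs: the payoffs and the 1000-move game length are assumptions, the random number source is passed in explicitly so that runs are repeatable, and all names are hypothetical.

```python
import random

# Assumed payoffs and game length, consistent with the raw scores in Appendix Two.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def ddd(own, other, rng):
    return "D"


def ccc(own, other, rng):
    return "C"


def eth(own, other, rng):
    """Eatherly: after an opposing defection, defect with probability equal
    to the opponent's relative frequency of defection so far."""
    if not other or other[-1] == "C":
        return "C"
    freq_defect = other.count("D") / len(other)
    return "D" if rng.random() < freq_defect else "C"


def cha(own, other, rng):
    """Champion: ten co-operations, fifteen moves of tit-for-tat, then a
    three-condition test before any defection."""
    move = len(other) + 1
    if move <= 10:
        return "C"
    if move <= 25:
        return other[-1]
    freq_coop = other.count("C") / len(other)
    if other[-1] == "D" and freq_coop < 0.6 and rng.random() > freq_coop:
        return "D"
    return "C"


def play(a, b, moves=1000, seed=0):
    """Play a against b with a seeded random source; return both total scores."""
    rng = random.Random(seed)
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(moves):
        move_a, move_b = a(hist_a, hist_b, rng), b(hist_b, hist_a, rng)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(eth, ccc), play(cha, ccc))
```

Both rules are nice in the sense used in this enquiry: neither defects unless the opponent has already done so, which is why each attains perpetual mutual co-operation with CCC.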


The Hybrid Family


FRI: Friedman's strategy. It co-operates until its opponent 
defects, after which it defects for the rest of the game. 

TES: Gladstein's strategy, called "tester". It defects on the 
first move. If its opponent ever defects, it "apologizes" by co- 
operating, and plays tit-for-tat thereafter. Otherwise, it defects 
with the maximum possible relative frequency that is less than 1/2, 
not counting its first defection. In other words, until its opponent 
defects, it defects on the first move, the fourth move, and every 
second move after that. 
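Both hybrids are deterministic, so their definitions can be checked against a simple simulation. The sketch below is illustrative: the payoffs and the 1000-move game length are assumptions, the apology in TES is read as a single co-operation on the move immediately following the opponent's first defection, and the function names are hypothetical.

```python
# Assumed payoffs and game length, consistent with the raw scores in Appendix Two.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def ccc(own, other):
    return "C"


def tft(own, other):
    return "C" if not other else other[-1]


def fri(own, other):
    """Friedman: co-operate until the opponent defects, then defect permanently."""
    return "D" if "D" in other else "C"


def tes(own, other):
    """Tester: probe with defections until the opponent retaliates, then
    apologize once and play tit-for-tat."""
    move = len(own) + 1
    if "D" in other:
        first_defection = other.index("D") + 1   # move number of the first opposing defection
        if move == first_defection + 1:
            return "C"                           # the apology
        return other[-1]                         # tit-for-tat thereafter
    if move == 1:
        return "D"
    # Until provoked: defect on move 4 and on every second move after that.
    return "D" if move >= 4 and move % 2 == 0 else "C"


def play(a, b, moves=1000):
    """Play a against b and return both total scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(moves):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(tes, ccc))   # TES exploits the non-provocable CCC on half of the moves
print(play(tes, tft))   # TES backs off as soon as TFT retaliates
print(play(tes, fri))   # FRI never forgives the opening defection
```

Under this reading the three pairings yield 4000 to 1500, 2999 to 2999 and 1002 to 1007 respectively, which illustrates why TES prospers against the forgiving but suffers against the unforgiving.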


Appendix Two — Table of Raw Scores 


Table A2.1 — Matrix of Raw Scores, Main Tournament 


DOD TD RAN DX OC IFT TIT BRE SHU TAT MEU MAD MAE MAC WYD GRO CHR ETH FRY TES 
JRD 1000 1952 2992 3996 5000 1004 1008 1004 1176 4996 1212 1024 1272 1380 3664 3292 1040 1004 1004 1004 
RO 762 1727 2634 3550 4470 1673 2324 1520 948 3580 920 777 1025 1098 4484 3023 2426 2405 748 1693 
RAW 502 1354 2243 3139 3972 2193 3129 2098 713 2299 659 529 743 825 3968 2681 3200 3076 523 2161 
MC 251 1095 1914 2685 3472 2706 3295 2436 537 1124 405 307 479 570 3460 2586 3476 3280 248 2721 
QC 0 795 1542 2292 3000 3000 3000 2700 3000 O 120 30 243 264 3000 3000 3000 3000 3000 1500 
TFT 999 1673 2193 2701 3000 3000 3000 1036 3000 2250 1108 1019 1267 2965 3000 3000 3000 3000 3000 2999 
TTT 998 1444 1874 2365 3000 3000 3000 2662 3000 1800 1143 1002 1204 2935 3000 3000 3000 3000 3000 1500 
ARE 999 1690 2433 2766 3200 1041 3197 1033 1174 2367 1148 1018 2989 3140 3242 2735 3215 3225 1027 1049 
SU %6 1878 2913 3877 3000 3000 3000 974 3000 4529 1125 1008 1263 1322 3000 3000 3000 3000 3000 2999 
TAT 11065 2219 3534 5000 2250 2800 2132 384 2000 230 42 368 466 4984 3266 2837 2851 6 21 
HEV 947 1995 2899 3940 4920 1113 1538 1133 1160 4750 2384 1003 2396 1087 4875 3241 2610 2294 955 1175 
AAD 994 2037 3004 3912 4980 1024 1147 1013 1168 4972 1181 1029 1266 1332 4940 3273 1193 1243 996 1013 
ARE 932 1970 2903 3899 4838 1272 1624 2544 1183 4628 2356 987 2594 2123 4852 3232 2870 3000 939 1312 
ARC 905 1878 2900 3750 4824 2965 2995 2665 1237 4556 1741 971 1849 1807 4814 3270 3000 2893 955 2926 
MD 334 764 1543 2305 3000 3000 3000 2637 3000 14 165 45 217 279 3000 3000 3000 3000 3000 2500 
GRO 427 1233 2111 2721 3000 3000 3000 2385 3000 1156 586 468 672 670 3000 3000 3000 3000 3000 2995 
CHA 990 1541 2010 2281 3000 3000 3000 2660 3000 1697 2140 988 2215 2210 3000 3000 3000 3000 3000 2987 
ETH 999 1475 1875 2445 3090 3000 3000 2640 3000 1681 1614 988 2263 2505 3000 3000 3000 3000 3000 2999 
FRI 999 2023 2933 4033 3000 3000 3000 1027 3000 4991 1195 1031 1259 1325 3000 3000 3000 3000 3000 1007 
TES 999 1688 2156 2731 4000 2999 4000 1044 2999 2246 1175 1008 1307 2951 2500 2995 3007 2999 1002 2998 
Appendix Three — Efficiency Tables


Table A3.1 — 19 Appearances in 190 Sub-Tournaments
Involving 2 Strategies


        1    2   EFF%

TFT    19    0  100
SHU    19    0  100
FRI    19    0  100
TTT    15    4   78.9
MAE    15    4   78.9
CHA    14    5   73.7
ETH    14    5   73.7
TES    13    6   68.4
MEU    12    7   63.2
GRO    12    7   63.2
MAC    10    9   52.6
NYD    10    9   52.6
CCC     9   10   47.4
DDD     8   11   42.1
RAN     8   11   42.1
TAT     8   11   42.1
MAD     8   11   42.1
TQD     7   12   36.8
TQC     6   13   31.6
BBE     1   18    5.3
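
The appearance and sub-tournament counts in the table titles follow from the size of the field. With twenty strategies, a set of sub-tournaments involving Z strategies contains C(20, Z) sub-tournaments, and each strategy appears in C(19, Z-1) of them. The following Python sketch reproduces the counts in the titles; the function name `table_sizes` is illustrative and is not taken from the original programs.

```python
from math import comb

TOTAL_STRATEGIES = 20  # size of the full field in the tournament


def table_sizes(z, n=TOTAL_STRATEGIES):
    """Return (appearances per strategy, number of sub-tournaments)
    for sub-tournaments involving z of the n strategies."""
    appearances = comb(n - 1, z - 1)   # subsets that contain a given strategy
    sub_tournaments = comb(n, z)       # all subsets of size z
    return appearances, sub_tournaments


for z in (2, 3, 4, 10):
    print(z, table_sizes(z))
```

For Z = 2 this gives 19 appearances in 190 sub-tournaments, in agreement with Table A3.1.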


Table A3.2 — 171 Appearances in 1,140 Sub-Tournaments
Involving 3 Strategies


        1    2    3   EFF%

MAE   122   26   23   78.9
TES   100   55   16   74.6
ETH    85   75   11   71.6
MEU    90   55   26   68.7
CHA    78   74   19   67.3
TTT    75   72   24   64.9
MAC    72   67   32   61.7
GRO    44   80   47   49.1
MAD    47   51   73   42.4
TQD    40   63   68   41.8
RAN    35   57   79   37.1
DDD    40   46   85   36.8
TAT    32   44   95   31.6
NYD    35   34  102   30.4
TQC    27   43  101   28.4
CCC    36   16  119   25.7
BBE     1   44  126    5.3




Table A3.3 — 969 Appearances in 4,845 Sub-Tournaments
Involving 4 Strategies


        1    2    3    4   EFF%

FRI   604  213  125   27   81.3
MAE   639  129  135   66   79.5
SHU   492  333  132   12   78.2
TFT   371  450  146    2   74.3
TES   414  365  158   32   73.3
ETH   330  403  226   10   69.6
MAC   386  317  205   61   68.7
MEU   420  274  173  102   68.1
CHA   313  398  220   38   67.3
TTT   241  346  311   71   59.4
GRO   113  220  466  170   42.8
MAD   182  198  254  335   41.1
TQD   151  189  349  280   40.6
DDD   142  198  247  382   36.8
RAN    97  201  329  342   35.2
TAT   100  175  240  454   30.6
TQC    80   93  209  587   21.8
NYD    87   60  247  575   21.6
BBE     5  108  362  494   20.4
CCC    97   30  173  669   18.0
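
The EFF% column is not defined in this appendix, but the tabulated values are consistent with a simple positional score: a strategy earns Z - k points each time it finishes k-th among Z competitors, and the total is expressed as a percentage of the maximum possible, (Z - 1) points per appearance. The Python sketch below is a reconstruction inferred from the tables under that assumption, not a transcription of the COMTOU program; the function name `efficiency` is illustrative.

```python
def efficiency(rank_counts):
    """Percentage efficiency from a row of rank counts.

    rank_counts[k-1] is the number of sub-tournaments in which the
    strategy finished k-th. Finishing k-th among z competitors is
    assumed to earn (z - k) points (an inference from the tables).
    """
    z = len(rank_counts)
    appearances = sum(rank_counts)
    points = sum(count * (z - rank)
                 for rank, count in enumerate(rank_counts, start=1))
    return 100.0 * points / (appearances * (z - 1))


# Rows taken from Tables A3.1, A3.2 and A3.3.
print(round(efficiency([15, 4]), 1))               # 15 firsts, 4 seconds
print(round(efficiency([122, 26, 23]), 1))         # MAE, 3 strategies
print(round(efficiency([604, 213, 125, 27]), 1))   # FRI, 4 strategies
```

Under this reading the three rows evaluate to 78.9, 78.9 and 81.3 respectively, matching the printed entries.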


Table A3.4 — 3,876 Appearances in 15,504 Sub-Tournaments
Involving 5 Strategies


         1     2     3     4     5   EFF%

MAE   2290   624   425   406   131   79.3
FRI   2050   797   640   299    90   78.5
SHU   1570  1261   749   268    28   76.3
MAC   1590  1059   786   364    77   74.0
TES   1229  1280   976   338    53   71.2
TFT    952  1564  1114   229    17   70.7
ETH   1039  1334  1190   310     3   70.0
CHA   1075  1239  1177   343    42   69.1
MEU   1359  1071   635   553   258   67.5
TTT    622   980  1334   800   140   57.4
MAD    519   661   726   962  1008   41.8
TQD    332   638   855  1241   810   39.9
GRO    267   468  1048  1612   481   39.9
DDD    331   643   663   867  1372   35.1
RAN    219   446   806  1378  1027   33.6
TAT    220   470   494  1200  1487   29.0
BBE     22   268   835  1305  1446   24.9
TQC    141   246   378   935  2176   19.3
Table A3.5 — 11,628 Appearances in 38,760 Sub-Tournaments
Involving 6 Strategies


         1     2     3     4     5     6   EFF%

MAE   6253  2081  1161  1186   750   197   79
MAC   4996  2852  1976  1324   403    77   78
FRI   5341  2354  1922  1292   570   149   77
SHU   4085  3516  2284  1328   365    50   76
CHA   2978  3035  3476  1800   312    27   71
ETH   2651  3328  3652  1753   239     5   71
TES   2826  3219  3166  1848   507    62   70
TFT   1887  3848  3839  1671   357    26   68
MEU   3217  3341  1646  1665  1283   476   67
TTT   1299  2220  3307  3255  1338   209   57.
GRO    399  1097  1899  3983  3664   586   40
MAD   1041  1787  1500  2116  2460  2724   40
TQD    615  1386  1702  2886  3422  1617   39.
DDD    544  1549  1454  1929  1924  4228   32.
RAN    333   728  1338  2983  4086  2160   32.
BBE     42   446  1718  2986  3118  3318   27.
TAT    366  1009  1034  1545  3733  3941   27.
NYD    118   308   595  1222  3972  5413   17.
TQC    210   340   590  1121  3303  6064   16.
CCC    147   175   342   782  2865  7317   11.
Table A3.6 — 27,132 Appearances in 77,520 Sub-Tournaments
Involving 7 Strategies


          1     2     3     4     5     6      7   EFF%

MAC   12293  6222  3839  2934  1461   333     50   81.3
MAE   13365  5308  2644  2447  2174   980    214   79.8
FRI   11032  5342  4606  3157  2076   742    177   77.2
SHU    8603  7867  5079  3642  1516   389     36   77.2
CHA    6490  6222  7352  5386  1497   175     10   73.0
ETH    5333  6820  7951  5496  1395   135      2   72.1
TES    5021  6766  6580  5758  2422   533     52   69.4
TFT    3025  7134  8897  5811  1886   343     36   68.2
MEU    5754  7830  3907  3399  3522  2113    607   66.7
TTT    2112  4108  5949  7935  5160  1665    203   57.0
MAD    1560  3543  3238  3652  4807  5240   5092   40.4
GRO     520  1559  3007  5714  9516  6402    414   40.2
TQD     808  2326  3241  4443  7203  6249   2862   39.0
DDD     608  2574  3144  3031  4467  4614   8694   31.8
RAN     426   896  1817  3356  8324  8785   3528   30.6
BBE      58   703  2292  5328  6627  5957   6167   29.6
TAT     547  1416  1933  2189  4085  8775   8187   25.6
NYD      72   304   810  1573  3778  9441  11154   16.5
TQC     189   393   709  1225  3063  7656  13897   14.4
CCC      90    94   440   988  2503  6951  16066   10.9
Table A3.7 — 50,388 Appearances in 125,970 Sub-Tournaments
Involving 8 Strategies


1 2 3 4 5 6 7 8 EFF%


24118 11002 6177 4618 3146 128 180 19 839 
22822 10531 5072 4109 4015 2681 992 166 80.3 
14803 14207 9027 708 3768 108 % 14 7.4 
18226 98% 0418 6356 418 2416 67 152 73 
11166 10566 12612 10669 4594 720 60 1 74.4 
8613 11497 13544 11382 490 810 50 2 Bi 
7066 11089 11207 11065 7302 2209 45 3 68.9 
3840 10310 15743 12739 5841 164 24 27 6.9 
8033 14164 7650 5792 6401 5032 246 670 66.5 
8916 12794 12552 5748 1551 130 57.1 
1805 5332 5582 5330 7810 7762 9689 8078 40.1 
522 1669 3041 6646 12486 16522 8449 3 39.2 
783 2939 4547 5954 10290 13061 9267 3547 BS 

68 7% 2417 6944 10962 10819 9609 8794 3.9 
451 3029 5037 4508 6501 7718 7598 15466 30.5 
420 821 2015 3573 8291 16768 13404 50% 29,5 
627 1515 2654 2860 4688 9030 15441 13573 24.3 
29 197 70 1513 3459 872 18274 17449 16.2 
143 340 57 1113 228 6937 15365 23286 13.5 
36 % 184 828 2398 5069 12750 29098 9.8 
Table A3.8 — 75,582 Appearances in 167,960 Sub-Tournaments
Involving 9 Strategies


1 2 3 4 5 6 7 8 9 EFF%


MAC 38131 15736 8597 5573 4708 2206 572 5% 3 
MAE 31536 16673 7967 5859 5626 4769 2347 77 88 80.8 
SHU 20881 21017 13060 10510 6751 2639 622 100 2 
FRI 24281 14599 12813 9957 7354 4497 1913 479 89 
CHA 15374 14734 17942 15995 9178 2100 246 13 0 75.6 
ETH 11107 15962 18819 17679 9408 265 332 10 0 
7777 14673 15421 16171 13968 6246 1381 28 17 
3943 11754 21262 21043 12227 4245 990 109 9 67.9 
8746 20204 12365 8228 8873 8630 5643 2396 497 66.4 
2452 6994 11196 15725 20251 13200 4575 1122 67 57.1 
1536 6068 7700 6602 9043 11614 10279 12160 10580 39.3 
406 1450 3357 7047 13746 21195 20340 793 118 39.0 
630 2755 5083 6913 10454 17595 1742 11133 3597 32.0 
702 2042 6426 13416 16030 14712 11927 10285 31.9 
212 2424 5897 6094 6901 11203 11005 9673 22173 29.3 
293 647 1640 2863 5898 16244 25914 17014 5067 28.5 
597 1308 2660 3108 4564 7837 15252 21339 1897 23.0 
3 71 398 1214 2666 6555 13720 28139 22808 15.7 
89 180 397 753 1388 4353 12364 25710 30348 12.8 
9 0 39 224 1513 4520 8328 17737 43212 392 
Table A3.9 — 92,378 Appearances in 184,756 Sub-Tournaments
Involving 10 Strategies


1 2 3 4 5 6 7 8 9 10 EFF%


MAC 48981 18468 -9999 -5672 5111 3022 949 169 7 0 
MAE 35411 23570 10181 7020 6460 6017 3858 1493 39 2 813 
SHU 24361 25207 15892 12170 9008 4307 128 165 2 0
FRI 25997 18180 14466 12839 9383 6998 3150 1101 216 48 
CHA 17082 16832 21127 19435 13000 4221 67 8 1 0 76.6 
ETH 11538 18032 21656 21768 13837 4691 767 89 0 0 
6724 15254 16875 18740 18540 12157 339 607 78 7 
3231 10740 22751 25645 19456 0048 2061 1 X 1 6.9 
7461 22688 16227 10020 9994 11019 8339 4572 1758 300 66.4 
1775 6151 11125 16241 22881 21515 9204 288 587 18 57.1 
932 5385 8292 7327 8869 13181 13269 11694 13396 10033 39.3 
22 1071 2353 5227 11132 20304 26604 19762 5587 3 3.5 
370 1985 4270 6302 9458 16892 22413 17352 10310 3026 37.8 
1447 4333 11066 18122 19621 16137 11785 9415 32.5 
52 131 5040 6514 6817 9984 12785 12779 12005 25091 28.3 
171 406 904 1988 3735 9560 24650 29770 16969 432 27.7 
455 979 1884 2581 3407 6543 10551 20650 24282 21046 21.8 
0 13 110 532 1637 4132 9526 18188 33868 24372 15.3 
2 8 180 3N 680 1827 5777 16309 32065 35060 11.7 
1 0 0 27 260 2216 5863 10585 21597 51829 8.5 
Table A3.10 — 92,378 Appearances in 167,960 Sub-Tournaments
Involving 11 Strategies


1 2 3 4 5 6 7 8 9 10 11 EFF%


MAC 51491 17887 9448 5181 39% 3084 1048 22 2l 0
MAE 32357 22884 10656 6935 6248 5775 4670 2102 648 9 
SHU 23552 24482 16053 11800 8963 5393 1809 32 3 1 
FRI 22190 19110 13934 12783 9988 7808 4519 1514 47 70 
CHA 15369 15934 20411 19263 14272 5931 1064 130 4 0 

9639 16562 20590 21494 15896 6611 1380 197 9 0 

2024 7913 19333 24429 22358 12229 3220 744 126 2 

4472 12623 15148 17808 18866 15712 6381 1193 164 U 

4988 19940 17480 10283 9151 10729 9757 6167 2861 094 

987 4113 8577 13631 20009 24285 14557 4581 142 209 4 97.1 

390 3427 6900 6643 7115 11081 13985 11422 9922 12679 
59 547 1475 2912 7404 15317 22087 24071 15288 3210 8 3.2 
1140 2742 4434 6644 11672 20162 21513 14680 7384 1817 37.4 
4 187 906 2479 6590 13170 20889 18887 13387 9199 6686 33.3 
11 43 2856 5102 5555 7716 11781 12424 12165 10859 23473 2.5 
59 174 399 950 1891 4357 12771 28750 26979 12940 3108 27.0 
223 600 1006 1593 2196 3801 7833 11652 22142 22269 19063 20.8 
0 0 15 8 538 2212 5303 9866 18791 34771 20797 15.0 
10 23 56 126 261 573 1994 6420 18236 32438 32241 1.1 
0 0 0 0 15 499 2754 5797 10655 20979 51679 7.9 
Table A3.11 — 75,582 Appearances in 125,970 Sub-Tournaments
Involving 12 Strategies


1 2 3 4 5 6 7 8 9 10 11 12 EFF%
WAC 44390 14442 -7149 -3852 2582 2037 98 18 24 0 0 0 91.8 
SHU 18720 19300 13502 9578 6995 5060 1973 436 18 0 | 0 82.8 
ME 23880 19982 9158 5771 4884 4538 4097 2310 780 163 19 0 82.4 
CHA 11212 12350 16130 15787 12457 6163 1311 151 2 0 0 0 7.1 
FRI 15039 16367 11673 9968 8559 6734 4717 198 47 146 12 2 7.8 
EM 6416 12300 16000 17325 14543 6934 1789 243 32 0 0 0 7.1 
IFT 945 4513 12881 18665 19306 13829 4382 821 216 24 0 0 67.9 
TES NN 8249 10946 13523 15264 14742 8477 1908 25 17 0 0 67.3 
MV 2546 13781 15169 0534 7336 0247 8778 6298 324 1262 32 % 665 
ITT 395 220 5058 9136 13921 19812 16762 6066 1864 479 69 0 957.0 
MD 125 1592 4314 4904 4651 7277 10328 10493 9228 7462 9656 5552 39.2 
GO 6 159 622 1385 3286 8595 15700 18683 16999 8747 1400 0 3.0 
mo 62 461 1331 2289 3837 6912 12962 18194 14896 9320 4343 975 3.2 
BEE 0 53 456 1173 2998 6829 14556 18091 13507 8583 5474 382 3.2 
DD O O 1103 264 3477 4361 7233 9939 11219 9088 0394 17822 26.5 
AM 15 48 19 337 688 1493 4454 13838 24482 19852 0511 1745 26.2 
Tar 03 294 385 79 1087 1782 38% 6750 10365 19415 16372 14363 19.8 
mm o 0 0 7 3 448 2464 5539 8171 15305 28803 14809 14.6 
mm o 1 10 4 62 158 506 1712 5510 17965 25323 2475 10.6 
ær 50 0 0 0 0 30 614 2438 4616 8709 17266 41909 7.4 


Table A3.12 — 50,388 Appearances in 77,520 Sub-Tournaments
Involving 13 Strategies


1 2 3 4 5 6 7 8 9 10 11 12 13 EFF%
MC 31377 9561 4329 2169 1340 963 563 80 6 0 0 0 0 93.6 
SH 12032 12451 9457 6382 4491 3460 167 406 3 0 0 0 0 83.6 
KAE 14086 14423 6323 4039 3114 2935 2640 1980 658 171 19 0 0 8.0 
CHA 6583 7748 10509 10601 8702 4893 119 12 22 4 0 0 0 78.8 
FRI 7993 11068 8433 6600 5556 4780 354 1823 499 9 10 1 0 78.0 
EM 3320 7400 9958 11594 10395 5738 170 283 23 l 0 0 0 76.1 
IFT 316 1916 6481 11581 12453 11699 4893 841 158 49 1 0 0 67.8 
TES 737 4065 6293 8153 9963 10382 7970 2441 30 46 0 0 0 66.8 
AY 937 7439 10485 5758 5050 4883 6155 509 2777 1308 44 9 6 66.6 
MT 99 699 2158 4690 7849 12043 13628 6711 1738 653 116 4 0 56.9 
MD 17 473 1896 2732 2685 4014 6252 7597 6253 5677 4538 5545 2709 3.2 
GO O 14 182 428 1159 3022 7523 1184 12314 9418 4078 40 0 3.5 
RO 14 154 516 902 1531 2831 6078 10604 12693 8539 4372 1830 324 37.0 
RE 0 8 17% 55 971 2638 6696 12507 10904 7323 4210 2762 1679 35.1 
Mp 0 0 193 1040 1707 2154 3873 5540 6815 6591 5071 5497 11107 25.7 
MN 1 5 R 7 1% 389 1236 3672 11330 17693 1093 4118 67 25.7 
Ar 2 103 130 26 361 629 1305 3091 4388 7937 13612 9954 0633 18.8 
m oo 0 0 0 1 18 446 2069 392 5676 11142 19075 8040 14.4 
mw 0 0 0 2 14 30 82 302 963 3513 12297 16747 16441 9.6 
æ 0 0 0 0 | 0 67 539 1702 2823 5917 11497 27883 6.9 


Table A3.13 — 27,132 Appearances in 38,760 Sub-Tournaments
Involving 14 Strategies


1 2 3 4 5 6 7 8 9 10 11 12 13 14 EFF%
MAC 18087 5033 192 - 952 500 35 29 5 0 0 0 0 0 0 %.2 
SHV 6111 6656 5368 3541 2349 1780 105 m 27 0 0 0 0 0 84.5 
ME 6550 8499 3461 2404 1617 1447 135 164 47 15 8 0 0 0 83.6 
GA 3037 3927 5529 5853 4908 3003 75 9 7 2 0 0 0 0 79.4 
FRI 3206 5796 5197 3682 2873 290 220 1193 39 6 9 0 0 0 78.3 
ETH 1288 3543 4907 6304 5893 3681 1312 168 16 0 0 0 0 0 7.3 
IFT 70 556 2437 5462 6423 707 4175 86 9 3 5 0 0 0 67.5 
AY 240 3007 5710 3185 02814 245 320 302 200 879 35 4 u 0 66.7 
TES 164 1466 2743 3819 575 576 5 237 32 32 3 0 0 0 66.2 
TT 1 148 662 1665 3394 5692 7891 5533 1568 406 142 20 0 0 56.8 
HAD 0 82 556 1089 1182 1549 2702 4002 3683 3338 2653 2161 2776 1159 37.4 
GRO 0 0 114 9% 240 813 2566 5466 6377 5788 4244 1459 6 0 37.2 
nD 5 40 1% 215 520 689 2160 4333 6582 6123 3789 166 627 99 36.8 
BRE 0 0 45 174 O89 778 2048 5509 7193 4864 3160 1529 105 538 36.1 
RAN 0 0 1 12 °2 7% 205 781 264 7396 9899 4463 1524 199 25.4 
DOD 0 0 3 260 506 738 129 2465 3609 3637 3753 3068 2483 5319 3.3 
TAT 3 14 3 30 69 157 35 84 1775 2678 4910 7389 4870 4067 18.1 
YD 0 0 0 0 0 0 2 404 1341 2076 2709 6757 10375 3450 14.1 
me 0 0 0 0 1 5 14 40 141 522 1786 7074 8124 9428 87 
ce 0 0 0 0 0 0 0 71 382 799 1376 3158 6858 14488 6.5 
Table A3.14 — 11,628 Appearances in 15,504 Sub-Tournaments
Involving 15 Strategies


1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 EFF%
MAC 8396 1976 714 23 6 7% 4 5 0 0 0 0 0 0 0 96.6 
SH B72 272 270 1677 ” 79 437 190 12 0 0 0 0 0 0 05.4 
MAE 2324 3998 15 1152 681 570 57 474s 1 0 0 0 0 843 
Cia 1081 1573 2300 2588 2233 1409 40 3 1 0 0 0 0 0 0 80.0 
FRI 923 2307 2606 1691 1218 104 930 677 186 37 8 0 0 0 0 78.6 
EM 345 1308 1894 2713 2650 1823 765 124 6 0 0 0 0 0 0 76.4 
TT 7 101 626 1899 2646 3129 242 693 56 12 7 0 0 0 0 67.2 
HEV 37 868 2364 1417 1250 1078 1242 1508 1156 45 218 % 9 0 0 66.7 
TS DB 374 859 1320 2093 2487 262 1629 268 13 0 0 0 0 0 65.7 
TT 0 12 12 47 987 2020 3330 3245 #120 15 79 13 0 0 0 56.6 
mo 0 0 0 5 3 1417 472 1756 2829 2818 1889 1380 331 0 0 37.4 
KD 0 6 108 261 379 468 988 1506 1807 1555 1448 996 768 1021 317 37.3 
BE 0 0 9 34 4 169 44 1660 3047 2682 174 %0 46 26 130 37.1 
Mm 90 6 15 3 £89 196 493 1209 2618 3182 2116 1063 433 156 19 36.8 
RW 0 0 0 1 2 8 2 7 37 1307 3758 4181 1509 351 3 249 
DD 0 0 0 7 103 17% 278 623 1136 1578 1473 1812 1288 1204 1951 24.0 
ar 0 t 4 3 2 % 4 1% 277 860 1422 2402 2971 2011 1465 17.5 
Wp 0 0 0 0 0 0 0 2 22 576 841 1352 3267 4260 1096 13.8 
m 0 0 0 0 0 0 0 2 9 20 20 800 3228 3103 4261 8.0 
ær q 0 0 0 0 0 | 0 2 16 304 520 127 34 623 5.8 


Table A3.15 — 3,876 Appearances in 4,845 Sub-Tournaments
Involving 16 Strategies


1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
MAC 3053 552 197 -45 GB 10 4 0 0 0 0 0 0 0 0 
SAU 672 1076 821 634 299 19 130 42 3 0 0 0 0 0 0 
ME 597 1488 509 417 248 179 168 159 100 1 0 0 0 0 0 
Ga 287 478 71 878 808 516 146 2 0 0 0 0 0 0 0 
FRI 176 672 1012 63 48 35 28 53 4 10 0 0 0 0 9 
EM 58 3 566 8% 92 69 344 67 0 0 0 0 0 09 % 
IFT 0 9 9 460 783 1077 1006 42 34 6 0 1 0 0 0 
HEY 3 164 714 485 443 400 360 489 44 23 64 16 #2 0 9 
IE 1 64 179 324 594 816 936 772 182 8 0 0 0 0 0 
IIT 0 0 8 61 192 456 105 1310 700 109 8 7 9 0 0 
BRE 0 0 1 4 2 2 84 298 868 1036 747 414 2 % 31 
GRO 0 0 0 0 0 6 3 262 791 1097 880 46 316 57 0 
MD 0 0 1 224 9 108 192 449 618 588 509 477 21 241 246 
my 0 0 0 4 5 26 84 218 593 1012 1033 603 207 70 2 
RAN 0 0 0 0 0 0 5 5 3% 158 648 1313 142 M 5 
Do 0 0 0 0 1 14 40 89 22 401 49 702 541 389 288 
TAT 0 0 0 0 0 3 3 WU %2 7 274 474 1101 863 677 
nD 0 0 0 6 0 0 8 0 1418 100 143 259 397 1403 1333 
Me 0 0 0 9 0 0 0 0 0 0 7 47 243 99 102 
ae 0 0 0 0 0 0 0 0 0 5 3 99 167 a 1169 
Table A3.16 — 969 Appearances in 1,140 Sub-Tournaments
Involving 17 Strategies


1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
MC 88 % 3 2 2 1 0 0 0 0 0 0 0 0 0 
SHU 1 27 24 17 " V 2 9 0 0 0 0 0 0 0 
Æ 101 429 121 #1139 82 83 6 A 0 0 0 0 0 0 
GA 49 #107 174 2277 #222 153 2 5 0 0 0 0 0 0 0 
FI 14 14 290 1670 16 2 5 60 39 0 0 0 0 0/09 
EM 5 5 124 209 %2 19 106 7 0 0 0 0 0 0 0 
HEV 0 16 149 123 115 106 102 106 143 8 18 3 0 0 0 
IFT 0 0 8 66% 157 27 W% 15 18 1 0 0 0 0 0 
TS 0 6 2 5 109 1799 5 5% 92 0 1 0 0 6 20 
HT 0 0 0 2 2 50 200 5 21 4 2 1 0 0 0 
BBE 0 0 0 0 0 2 6 3 140 26 22 74 À R 9 
6x0 0 0 0 9 0 0 0 13 166 260 28 160 8 38 7 
my 0 0 0 0 0 1 #4 N 6 2 28 27 %3 2 ” 
MD 0 0 0 0 0 18 30 61 134 184 114 1422 10 V B® 
RAN 0 0 0 0 0 0 0 0 1 5 4 BS 45 2237 3 
DoD 0 0 0 0 0 0 0 7 BW 60 16 18 14 168 73 
Jar 0 0 0 0 0 0 0 0 0 1 6 3% 11 438 169 
m 0 0 0 0 0 0 0 0 0 4 0 3 6 78 446 
IX 0 0 0 0 0 0 0 0 0 0 0 9 1 2 260 
ae 0 0 0 0 0 0 0 0 0 0 0 3 5 6 %5 
Appendix Four — Sample Programs


This appendix contains fifteen sample programs from the interactive
tournament. All programs are listed in GW-BASIC. The listings were
saved as ASCII files and imported into WordPerfect. To accelerate
data processing, the longer programs were compiled in TURBO BASIC
and saved as DOS-executable files.

All programs in this appendix are documented according to GW-BASIC
syntax. Remarks are therefore inserted in one of two permissible
ways: either following a "REM:" statement on a separately numbered
line, or following a single quotation mark (') after an executable
instruction.

The first ten programs consist of games between different strategic
pairs, so that the algorithms of all twenty strategies are
represented. Each program is named according to the particular
strategies in competition; e.g. "TFTvDDD" (tit-for-tat versus the
strategy of pure defection). In all games, the strategies' moves are
represented in binary form: "0" means a defection; "1", a
co-operation.

The eleventh sample program, COMTOU, generates the data used in
the combinatoric analyses of Chapter Eight. It accepts as input the
number of strategies to be involved in a given set of
sub-tournaments, then outputs the efficiency table for that set.
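
The kind of tally that COMTOU produces can be illustrated with a short Python sketch. It is not a translation of the GW-BASIC listing. It assumes that a strategy's score in a sub-tournament is the sum of its raw scores against every member of the subset, itself included, and that tied strategies share the better rank. Both assumptions are inferred from Table A3.1, where TFT, SHU and FRI each rank first in all 19 of their pairings. The three-strategy excerpt of raw scores further assumes that the columns of the raw-score table follow the same order as its rows.

```python
from itertools import combinations

# SCORES[a][b] is a's 1000-move raw score against b (diagonal = self-play).
SCORES = {
    "DDD": {"DDD": 1000, "CCC": 5000, "TFT": 1004},
    "CCC": {"DDD": 0,    "CCC": 3000, "TFT": 3000},
    "TFT": {"DDD": 999,  "CCC": 3000, "TFT": 3000},
}


def rank_counts(scores, z):
    """Tally how often each strategy finishes 1st, 2nd, ..., z-th
    over all sub-tournaments involving z strategies."""
    names = list(scores)
    counts = {name: [0] * z for name in names}
    for subset in combinations(names, z):
        # Assumed scoring: total raw score against all subset members.
        totals = {a: sum(scores[a][b] for b in subset) for a in subset}
        for a in subset:
            # Assumed tie rule: rank = 1 + number of strictly better totals.
            rank = 1 + sum(totals[b] > totals[a] for b in subset)
            counts[a][rank - 1] += 1
    return counts


for name, row in rank_counts(SCORES, 2).items():
    print(name, row)
```

In this toy field TFT finishes first in both of its pairings, while DDD and CCC each finish first once and second once.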

The twelfth sample program, ECOSYS, generates the data for the 
first ecosystemic competition of Chapter Nine, involving all twenty 
strategies. 

The remaining three sample programs are used for analyzing 
various aspects of the maximization strategies’ intra-familial 
performance. 

TESTMAT accepts input of any values of (W,X,Y,Z) (such that
W + X + Y + Z = 100) for the initial probabilistic event matrix, and
outputs the subsequently determined event matrices and their
associated expected utilities and scores, at one-hundred-move
intervals, for the remainder of the game.

MAXvMAX accepts input of initial co-operative weightings and the
number of games to be run, and outputs the resulting average scores
and distribution of scores.

MAXrMAX accepts input of W and Z values for the probabilistic event
matrix, and input of the domain of the difference between X and Y,
over which the corresponding range of initial event matrices is
generated. MAXrMAX outputs the (W,X,Y,Z) values for each initial
event matrix in the selected range, the final score resulting from
each event matrix in the range, and the number of moves required for
the onset of perpetual mutual co-operation (if it occurs).


Program A4.1 — TFTvDDD


100 DIM DDD(1000) 'ARRAY OF DDD's MOVES
110 DIM TFT(1000) 'ARRAY OF TFT's MOVES
120 TFT(1) = 1 'TFT CO-OPERATES ON MOVE 1
130 RANDOMIZE TIMER 'SEEDS PSEUDO-RANDOM GENERATOR
140 FOR I = 1 TO 1000 'GAME OF 1000 MOVES
150 IF RND(I) < 1 THEN DDD(I)= 0 ELSE DDD(I)= 1
155 REM: LINE 150: DDD DEFECTS WITH PROBABILITY OF UNITY
160 IF I<=999 THEN TFT(I+1) = DDD(I) 'TFT's DECISION RULE
165 REM: LINES 170-200 ASSIGN PAYOFFS TO OUTCOMES
170 IF DDD(I)=1 AND TFT(I)=1 THEN R=R+3: T=T+3
180 IF DDD(I)=1 AND TFT(I)=0 THEN T=T+5
190 IF TFT(I)=1 AND DDD(I)=0 THEN R=R+5
200 IF DDD(I)=0 AND TFT(I)=0 THEN R=R+1: T=T+1
210 IF I MOD 100=0 THEN PRINT I;R;T 'DISPLAY SCORE AT 100-MOVE INTERVALS
220 NEXT I 'NEXT MOVE
230 PRINT "DDD's score is" R 'PRINT FINAL SCORES
240 PRINT "TFT's score is" T
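
For readers without a GW-BASIC interpreter, the same game can be sketched in Python. This is an illustrative re-expression, not the author's program: the function names and the history-passing interface are assumptions, while the payoffs (3 for mutual co-operation, 5 and 0 for unilateral defection, 1 for mutual defection) and the binary move coding are those of the listing above.

```python
# (move_a, move_b) -> (payoff_a, payoff_b); 1 = co-operate, 0 = defect.
PAYOFF = {(1, 1): (3, 3), (1, 0): (0, 5), (0, 1): (5, 0), (0, 0): (1, 1)}


def tit_for_tat(own_history, other_history):
    """Co-operate on move 1, then echo the opponent's previous move."""
    return 1 if not other_history else other_history[-1]


def always_defect(own_history, other_history):
    """DDD: defect with probability of unity."""
    return 0


def play(strategy_a, strategy_b, moves=1000):
    """Play an iterated game and return (score_a, score_b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(moves):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b


ddd_score, tft_score = play(always_defect, tit_for_tat)
print("DDD's score is", ddd_score)
print("TFT's score is", tft_score)
```

DDD gains 5 on the first move and 1 on each of the remaining 999, for 1004; TFT scores 0 and then 999. These agree with the corresponding entries in the table of raw scores.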


Program A4.2 — CCCvTAT


100 DIM CCC(1000)
110 DIM TAT(1000)
120 FOR I = 1 TO 1000
130 TAT(1) = 0 'TAT DEFECTS ON MOVE 1
140 RANDOMIZE TIMER
150 IF RND(I) < 0 THEN CCC(I)= 0 ELSE CCC(I)= 1
155 REM: LINE 150: CCC CO-OPERATES WITH PROBABILITY OF UNITY
160 IF I<1000 THEN IF CCC(I) = 0 THEN TAT(I+1) = 1 ELSE TAT(I+1) = 0
165 REM: LINE 160: TAT's DECISION RULE
170 IF CCC(I)=1 AND TAT(I)=1 THEN R=R+3: T=T+3
180 IF CCC(I)=1 AND TAT(I)=0 THEN T=T+5
190 IF TAT(I)=1 AND CCC(I)=0 THEN R=R+5
200 IF CCC(I)=0 AND TAT(I)=0 THEN R=R+1: T=T+1
210 IF I MOD 100=0 THEN PRINT I;R;T
220 NEXT I
230 PRINT "CCC's score is" R
240 PRINT "TAT's score is" T
Program A4.3 — TQCvTTT


100 DIM TQC(1000)
110 DIM TTT(1000)
120 RANDOMIZE TIMER
130 FOR J=1 TO 1000
140 IF RND(J) < .75 THEN TQC(J) = 1 ELSE TQC(J) = 0
145 REM: LINE 140: TQC CO-OPERATES RANDOMLY WITH PROBABILITY 3/4
150 NEXT J
160 TTT(1) = 1 'TTT CO-OPERATES ON MOVES 1 AND 2
170 TTT(2) = 1
180 FOR I=1 TO 1000
190 IF I>2 THEN IF TQC(I-2)=0 AND TQC(I-1)=0 THEN TTT(I)=0 ELSE TTT(I)=1
195 REM: LINE 190: TTT's DECISION RULE
200 IF TQC(I)=1 AND TTT(I)=1 THEN R=R+3: T=T+3
210 IF TQC(I)=1 AND TTT(I)=0 THEN T=T+5
220 IF TTT(I)=1 AND TQC(I)=0 THEN R=R+5
230 IF TQC(I)=0 AND TTT(I)=0 THEN R=R+1: T=T+1
240 NEXT I
250 PRINT "TQC's score is" R
260 PRINT "TTT's score is" T


Program A4.4 — TQDvCHA


100 DIM TQD(1000)
110 DIM CHA(1000)
120 FOR I = 1 TO 1000
130 IF I<=10 THEN CHA(I)=1 'CHA CO-OPERATES ON FIRST 10 MOVES
140 RANDOMIZE TIMER
150 IF RND(I) < .75 THEN TQD(I)= 0 ELSE TQD(I)= 1
155 REM: LINE 150: TQD DEFECTS RANDOMLY WITH PROBABILITY 3/4
160 IF I>10 AND I<=25 THEN IF TQD(I-1)=1 THEN CHA(I)=1 ELSE CHA(I)=0
165 REM: LINE 160: CHA PLAYS TIT-FOR-TAT BETWEEN MOVES 11 AND 25
170 IF TQD(I)=1 THEN C=C+1 ELSE D=D+1
175 REM: LINE 170: CHA INCREMENTS THE NUMBER OF TQD's CO-OPERATIONS OR DEFECTIONS
180 IF I>25 THEN RANDOMIZE TIMER
190 IF I>25 THEN IF TQD(I-1)=0 AND C/(C+D)<.6 AND RND(J)>C/(C+D) THEN CHA(I)=0 ELSE CHA(I)=1 'CHA's DECISION RULE
200 IF TQD(I)=1 AND CHA(I)=1 THEN R=R+3: T=T+3
210 IF TQD(I)=1 AND CHA(I)=0 THEN T=T+5
220 IF CHA(I)=1 AND TQD(I)=0 THEN R=R+5
230 IF TQD(I)=0 AND CHA(I)=0 THEN R=R+1: T=T+1
240 IF I MOD 100=0 THEN PRINT I;R;T
250 NEXT I
260 PRINT "TQD's score is" R
270 PRINT "CHA's score is" T
Program A4.5 — RANvTES


100 DIM RAN(1001)
110 DIM TES(1001)
120 Z=0: W=0 'FLAGS WHICH, WHEN SET, DETERMINE TES's DECISION PATH
130 TES(1)=0 'TES DEFECTS ON MOVE 1
140 RANDOMIZE TIMER
150 FOR I = 1 TO 1000
160 IF RND(I)< .5 THEN RAN(I)=0 ELSE RAN(I)=1
165 REM: LINE 160: RAN CO-OPERATES OR DEFECTS RANDOMLY WITH PROBABILITY 1/2
170 IF I=1 THEN GOTO 230 'ASSIGN PAYOFFS TO OUTCOME OF MOVE 1
180 IF W=1 THEN GOSUB 390 'IF W FLAG IS SET THEN TES PLAYS TIT-FOR-TAT
190 IF RAN(I-1)=0 THEN Z=1 'IF RAN DEFECTS THEN Z FLAG IS SET
200 IF Z=1 THEN GOSUB 320 'IF Z FLAG IS SET THEN TES APOLOGIZES AND SETS W FLAG
210 IF RAN(I-1)=1 THEN Z=2 'TES's DECISION RULE UNTIL RAN DEFECTS
220 IF Z=2 THEN GOSUB 350
230 IF RAN(I)=1 AND TES(I)=1 THEN R=R+3: T=T+3
240 IF RAN(I)=1 AND TES(I)=0 THEN T=T+5
250 IF TES(I)=1 AND RAN(I)=0 THEN R=R+5
260 IF RAN(I)=0 AND TES(I)=0 THEN R=R+1: T=T+1
270 IF I MOD 100=0 THEN PRINT I;R;T
280 NEXT I
290 PRINT "RAN's score is" R
300 PRINT "TES's score is" T
310 END
320 TES(I)=1
330 W=1
340 RETURN 230
350 IF I=2 THEN TES(I)=1
360 IF I=3 THEN TES(I)=1
370 IF I>3 THEN IF I MOD 2=0 THEN TES(I)=0 ELSE TES(I)=1
380 RETURN 230
390 IF RAN(I-1)=0 THEN TES(I)=0 ELSE TES(I)=1
400 RETURN 230


Program A4.6 — SHUvBBE


100 DIM BBE(3000)
110 DIM SHU(3000)
115 REM: EXPANDED SCORE ARRAY SAFELY ACCOMMODATES SHU's INCREMENTING RETALIATORY DEFECTIONS
120 SHU(1)=1 'BOTH SHU AND BBE CO-OPERATE ON MOVE 1
130 BBE(1)=1
140 RANDOMIZE TIMER
150 FOR I=1 TO 1000+Q
155 REM: LINE 150: Q IS THE NUMBER OF MOVES ADDED DUE TO SHU's INCREMENTING DEFECTIONS. HOWEVER, THE GAME SCORE IS COMPUTED FROM THE FIRST 1000 MOVES.
160 IF I=1 THEN GOTO 260 'FIRST JOINT OUTCOME IS RECORDED. MAKE NEXT MOVES.
170 IF SHU(I-1)=1 AND RND(I)<.9 THEN BBE(I)=1 ELSE BBE(I)=0
175 REM: LINE 170: BBE's DECISION RULE
180 IF BBE(I-1)=1 THEN SHU(I)=1 'SHU's DECISION RULE IF BBE CO-OPERATES
190 IF BBE(I-1)=0 THEN Q=Q+1 ELSE GOTO 260
195 REM: LINE 190: SHU's DECISION RULE IF BBE DEFECTS
200 FOR K=I TO I+Q-1 'LINES 200-220: SHU's RETALIATORY DEFECTION(S)
210 SHU(K)=0
220 NEXT K
230 SHU(I+Q)=1
240 GOSUB 370 'BBE's RESPONSE TO SHU's RETALIATORY DEFECTIONS
250 I=I+Q
255 REM: LINE 250: GAME MOVE ADJUSTED TO ACCOMMODATE RETALIATORY DEFECTIONS. SHU RESUMES PLAYING TIT-FOR-TAT AT MOVE I+Q
260 NEXT I
270 FOR I=1 TO 1000 'ASSIGN PAYOFFS TO FIRST 1000 OUTCOMES
280 IF BBE(I)=1 AND SHU(I)=1 THEN R=R+3: T=T+3
290 IF BBE(I)=1 AND SHU(I)=0 THEN T=T+5
300 IF SHU(I)=1 AND BBE(I)=0 THEN R=R+5
310 IF BBE(I)=0 AND SHU(I)=0 THEN R=R+1: T=T+1
320 IF I MOD 100=0 THEN PRINT I;R;T
330 NEXT I
340 PRINT "BBE's score is" R
350 PRINT "SHU's score is" T
360 END
370 FOR J=I+1 TO I+Q
380 IF SHU(J-1)=1 AND RND(I)<.9 THEN BBE(J)=1 ELSE BBE(J)=0
390 NEXT J
400 RETURN
Program A4.7 — ETHvMEU


100 DIM ETH(1000)
110 DIM MEU(1001)
120 RANDOMIZE TIMER
130 FOR J=1 TO 100
140 IF RND(J)<.5 THEN MEU(J)=0 ELSE MEU(J)=1
150 NEXT J
155 REM: LINES 130-150: MEU MAKES FIRST 100 RANDOM MOVES, CO-OPERATING OR DEFECTING WITH EQUAL PROBABILITY
160 ETH(1)=1 'ETH CO-OPERATES ON FIRST MOVE
170 RANDOMIZE TIMER
180 FOR I=1 TO 1000 'MOVES OF THE GAME
190 IF MEU(I)=1 THEN X=X+1 ELSE Y=Y+1
195 REM: LINE 190: ETH UPDATES NUMBER OF MEU's CO-OPERATIONS AND DEFECTIONS
200 IF I>1 THEN IF MEU(I-1)=0 AND RND(I)<=(Y/(X+Y)) THEN ETH(I)=0 ELSE ETH(I)=1 'ETH's DECISION RULE
210 IF MEU(I)=1 AND ETH(I)=1 THEN C=C+1 'MEU UPDATES EVENT MATRIX
220 IF MEU(I)=1 AND ETH(I)=0 THEN D=D+1 'MEU UPDATES EVENT MATRIX
230 IF MEU(I)=0 AND ETH(I)=1 THEN E=E+1 'MEU UPDATES EVENT MATRIX
240 IF MEU(I)=0 AND ETH(I)=0 THEN F=F+1 'MEU UPDATES EVENT MATRIX
250 IF I>=100 THEN UC=3*C/(C+D) 'MEU FINDS EXPECTED UTILITY OF CO-OPERATION
260 IF I>=100 THEN UD=(5*E+F)/(E+F) 'MEU FINDS EXPECTED UTILITY OF DEFECTION
270 IF I>=100 THEN IF UC>=UD THEN MEU(I+1)=1 ELSE MEU(I+1)=0 'MEU's DECISION RULE
275 REM: LINES 280-310 UPDATE GAME SCORES
280 IF MEU(I)=1 AND ETH(I)=1 THEN M=M+3: S=S+3
290 IF MEU(I)=1 AND ETH(I)=0 THEN S=S+5
300 IF MEU(I)=0 AND ETH(I)=1 THEN M=M+5
310 IF MEU(I)=0 AND ETH(I)=0 THEN M=M+1: S=S+1
320 IF I MOD 100=0 THEN GOSUB 350 'PRINTS OUTCOMES, UTILITIES AND SCORES AT 100-MOVE INTERVALS
330 NEXT I
340 END
350 PRINT C+D+E+F "MOVES"
360 PRINT C,D,UC
370 PRINT E,F,UD
380 PRINT "ETH's SCORE IS" S
390 PRINT "MEU's SCORE IS" M
400 PRINT
410 RETURN 330
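
The expected-utility calculation shared by the maximization listings (MEU, MAD, MAE and MAC) can be isolated in a few lines of Python. The sketch below mirrors the expressions UC=3*C/(C+D) and UD=(5*E+F)/(E+F), where C, D, E and F count the four joint outcomes from the maximizer's point of view. The function names, the `strict` switch and the treatment of an empty row (returning 0 rather than dividing by zero) are assumptions added for illustration.

```python
def expected_utilities(c, d, e, f):
    """Expected utilities of co-operating and defecting.

    c: both co-operated          d: maximizer co-operated, opponent defected
    e: maximizer defected,       f: both defected
       opponent co-operated
    Empty rows yield 0.0 (an assumption; the BASIC listings do not guard).
    """
    uc = 3 * c / (c + d) if c + d else 0.0
    ud = (5 * e + f) / (e + f) if e + f else 0.0
    return uc, ud


def next_move(c, d, e, f, strict=False):
    """Return 1 (co-operate) or 0 (defect) for the following move.

    strict=False co-operates when UC >= UD; strict=True requires
    UC > UD, the comparison printed in the MAC listing.
    """
    uc, ud = expected_utilities(c, d, e, f)
    if strict:
        return 1 if uc > ud else 0
    return 1 if uc >= ud else 0


print(expected_utilities(25, 25, 25, 25))   # evenly spread event matrix
print(next_move(80, 5, 5, 10))              # mostly mutual co-operation
```

With an evenly spread event matrix the utilities are 1.5 for co-operation and 3.0 for defection, so the maximizer defects; once mutual co-operation dominates the record, co-operation has the higher expected utility.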
Program A4.8 — GROvMAD


100 DIM GRO(1000)
110 DIM MAD(1001)
120 RANDOMIZE TIMER
130 FOR J=1 TO 100
140 IF RND(J)<.9 THEN MAD(J)=0 ELSE MAD(J)=1
150 NEXT J
155 REM: LINES 130-150: MAD MAKES 100 RANDOM MOVES, DEFECTING WITH PROBABILITY 9/10
160 GRO(1)=1 'GRO CO-OPERATES ON MOVE 1
170 FOR I=1 TO 1000
180 IF I>1 THEN IF MAD(I-1)=GRO(I-1) THEN Q=1 ELSE Q=2 'Q IS GRO's DECISION FLAG
190 IF Q=1 THEN GRO(I)=1 'GRO CO-OPERATES FOLLOWING SYMMETRIC OUTCOME
200 IF Q=2 THEN RANDOMIZE TIMER
210 IF Q=2 THEN IF RND(I)<=(2/7) THEN GRO(I)=1 ELSE GRO(I)=0 'GRO CO-OPERATES WITH PROBABILITY 2/7 FOLLOWING ASYMMETRIC OUTCOME
220 IF MAD(I)=1 AND GRO(I)=1 THEN C=C+1 'MAD UPDATES EVENT MATRIX
230 IF MAD(I)=1 AND GRO(I)=0 THEN D=D+1 'MAD UPDATES EVENT MATRIX
240 IF MAD(I)=0 AND GRO(I)=1 THEN E=E+1 'MAD UPDATES EVENT MATRIX
250 IF MAD(I)=0 AND GRO(I)=0 THEN F=F+1 'MAD UPDATES EVENT MATRIX
260 IF I>=100 THEN UC=3*C/(C+D) 'MAD FINDS EXPECTED UTILITY OF CO-OPERATION
270 IF I>=100 THEN UD=(5*E+F)/(E+F) 'MAD FINDS EXPECTED UTILITY OF DEFECTION
280 IF I>=100 THEN IF UC>=UD THEN MAD(I+1)=1 ELSE MAD(I+1)=0 'MAD's DECISION RULE
290 IF MAD(I)=1 AND GRO(I)=1 THEN M=M+3: S=S+3
300 IF MAD(I)=1 AND GRO(I)=0 THEN S=S+5
310 IF MAD(I)=0 AND GRO(I)=1 THEN M=M+5
320 IF MAD(I)=0 AND GRO(I)=0 THEN M=M+1: S=S+1
330 IF I MOD 100=0 THEN GOSUB 360
340 NEXT I
350 END
360 PRINT C+D+E+F "MOVES"
370 PRINT C,D,UC
380 PRINT E,F,UD
390 PRINT "GRO's SCORE IS" S
400 PRINT "MAD's SCORE IS" M
410 PRINT
420 RETURN 340
Program A4.9 — FRIvMAE


100 DIM FRI(1000)
110 DIM MAE(1001)
120 RANDOMIZE TIMER
130 FOR J=1 TO 100
140 IF RND(J)<.28 THEN MAE(J)=0 ELSE MAE(J)=1
150 NEXT J
155 REM: LINES 130-150: MAE MAKES FIRST 100 RANDOM MOVES, CO-OPERATING WITH PROBABILITY 5/7
160 FRI(1)=1 'FRI CO-OPERATES ON MOVE 1
170 FOR I=1 TO 1000
180 IF Z=1 THEN GOTO 200 'Z IS A FLAG THAT IS SET IF FRI ENTERS THE DECISION PATH OF PERPETUAL DEFECTION
190 IF I>1 THEN IF MAE(I-1)=1 THEN FRI(I)=1 ELSE GOSUB 410
195 REM: LINE 190: FRI CO-OPERATES IF MAE CO-OPERATED ON PREVIOUS MOVE. IF MAE DEFECTED, THEN FRI ENTERS PATH OF PERPETUAL DEFECTION.
200 IF MAE(I)=1 AND FRI(I)=1 THEN C=C+1
210 IF MAE(I)=1 AND FRI(I)=0 THEN D=D+1
220 IF MAE(I)=0 AND FRI(I)=1 THEN E=E+1
230 IF MAE(I)=0 AND FRI(I)=0 THEN F=F+1
240 IF I>=100 THEN UC=3*C/(C+D)
250 IF I>=100 THEN UD=(5*E+F)/(E+F)
260 IF I>=100 THEN IF UC>=UD THEN MAE(I+1)=1 ELSE MAE(I+1)=0 'MAE's DECISION RULE
270 IF MAE(I)=1 AND FRI(I)=1 THEN M=M+3: S=S+3
280 IF MAE(I)=1 AND FRI(I)=0 THEN S=S+5
290 IF MAE(I)=0 AND FRI(I)=1 THEN M=M+5
300 IF MAE(I)=0 AND FRI(I)=0 THEN M=M+1: S=S+1
310 IF I MOD 100=0 THEN GOSUB 340
320 NEXT I
330 END
340 PRINT C+D+E+F "MOVES"
350 PRINT C,D,UC
360 PRINT E,F,UD
370 PRINT "FRI's SCORE IS" S
380 PRINT "MAE's SCORE IS" M
390 PRINT
400 RETURN 320
410 Z=1
420 FOR K=I TO 1000
430 FRI(K)=0
440 NEXT K
450 RETURN 200
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Program A4.10 — NYDVMAC 


100 DATA 1,6,7,17,22,23,26,29,30, 31,33, 38, 39,45,49,54,55,58, 61 

105 REM: LINE 100: THESE ARE THE CRITICAL SUMS OF WEIGHTED VALUES FOR 
THREE CONSECUTIVE MOVES WHICH ELICIT NYD's DEFECTION, ACCORDING TO 
ITS DECISION RULE 

110 DIM NYD(1000) 

120 DIM MAC(1001) 

130 DIM X(19) 

140 FOR K=1 TO 19 'LINES 140-160 READ CRITICAL SUMS INTO ARRAY X 

150 READ X(K) 

160 NEXT K 

170 RANDOMIZE TIMER 

180 FOR J=1 TO 100 

190 IF RND(J)<.1. THEN MAC(J)=0 ELSE MAC(J)=1 

200 NEXT J 

205 REM: LINES 180-200: MAC MAKES FIRST 100 RANDOM MOVES, CO-OPERAT- 
ING WITH PROBABILITY 9/10 

210 NYD(1)=1 'NYD CO-OPERATES ON MOVE 1 

220 FOR I=1 TO 1000 ‘GAME BEGINS 

230 IF I=i THEN GOTO 390 'MAC UPDATES EVENT MATRIX AFTER FIRST 
OUTCOME 

240 IF I=2 THEN IF MAC(I-1)=1 THEN NYD(I)=1 ELSE NYD(I)=0 

250 IF I=3 THEN IF (MAC(1)<>NYD(1) AND MAC(2)<>NYD(2)) OR MAC(2)=0 
THEN NYD(3)=0 ELSE NYD(3)=1 

255 REM: LINES 240-250: NYD's DECISION RULE FOR MOVES 2 AND 3 

260 IF I<4 THEN GOTO 390 'MAC UPDATES EVENT MATRIX AFTER SECOND AND 
THIRD OUTCOMES 

270 FOR L=1 TO 3 ‘LINES 270-330: NYD FINDS SUM OF WEIGHTED VALUES 
FOR THREE PREVIOUS CONSECUTIVE MOVES 

280 IF MAC(I-L)=0 THEN P=2 ELSE P=0 

290 IF NYD(I-L)=0 THEN Q=1 ELSE Q=0 

300 IF L=3 THEN SUM=SUM+16* (P+Q) 

310 IF L=2 THEN SUM=SUM+4" (P+Q) 

320 IF L=1 THEN SUM=SUM+P+) 

330 NEXT L 

340 FOR N=1 TO 19 ‘LINES 340-350: NYD DECIDES WHETHER SUM IS CRITI- 
CAL 

350 IF SUM=X(M) THEN GOSUB 600 "IF SUM IS CRITICAL THEN NYD CEFECTS 

360 NEXT N 

370 NYD(I)=1 ‘IF SUM IS NOT CRITICAL THEN NYD CO-OPERATES 

380 SUM=0: P=0: Q=0 ‘RESETS SUM AND VALUE COUNTERS 

390 IF MAC(I)=1 AND NYD(I)=1 THEN C=C+1 

400 IF MAC(I)=1 AND NYD(I)=0 THEN D=D+1 

410 IF MAC(I)=0 AND NYD(I)=1 THEN E=E+1 

420 IF MAC(I)=0 AND NYD(I)=0 THEN F=F+1 

430 IF I>=100 THEN UC=3*C/(C+D)

440 IF I>=100 THEN UD=(5*E+F)/(E+F)

450 IF I>=100 THEN IF UC>UD THEN MAC(I+1)=1 ELSE MAC(I+1)=0 'MAC's
DECISION RULE

460 IF MAC(I)=1 AND NYD(I)=1 THEN M=M+3: S=S+3

470 IF MAC(I)=1 AND NYD(I)=0 THEN S=S+5

480 IF MAC(I)=0 AND NYD(I)=1 THEN M=M+5

490 IF MAC(I)=0 AND NYD(I)=0 THEN M=M+1: S=S+1

500 IF I MOD 100=0 THEN GOSUB 530 


510 NEXT I

520 END

530 PRINT C+D+E+F "MOVES"

540 PRINT C,D,UC

550 PRINT E,F,UD

560 PRINT "NYD's SCORE IS" S

570 PRINT "MAC's SCORE IS" M

580 PRINT

590 RETURN 510

600 NYD(I)=0

610 RETURN 380
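
The decision rule that lines 270-370 and 600 implement for NYD can be restated compactly. The following is a minimal Python sketch, not part of the original BASIC listings; the function names `weighted_sum` and `nyd_next_move` are hypothetical, and moves are encoded as in the listing (1 = co-operate, 0 = defect). It assumes at least three previous moves are available, as the listing does from move 4 onward.

```python
# Critical sums copied from the DATA statement on line 100 of the listing.
CRITICAL_SUMS = {1, 6, 7, 17, 22, 23, 26, 29, 30, 31,
                 33, 38, 39, 45, 49, 54, 55, 58, 61}


def weighted_sum(mac_moves, nyd_moves):
    """Encode the three most recent outcomes as one number (lines 270-330).

    A defection by MAC contributes 2 and a defection by NYD contributes 1;
    the most recent move has weight 1, the one before 4, the one before 16.
    """
    total = 0
    for lag, weight in ((1, 1), (2, 4), (3, 16)):
        p = 2 if mac_moves[-lag] == 0 else 0
        q = 1 if nyd_moves[-lag] == 0 else 0
        total += weight * (p + q)
    return total


def nyd_next_move(mac_moves, nyd_moves):
    """Return 0 (defect) if the weighted sum is critical, else 1 (co-operate)."""
    return 0 if weighted_sum(mac_moves, nyd_moves) in CRITICAL_SUMS else 1
```

Because each outcome contributes a value from 0 to 3, the weighted sum is simply the last three outcomes read as a three-digit base-4 number, which is why 19 critical values out of 64 suffice to specify the rule.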
Program A4.11 — COMTOU and Chain Merge File Z2


5 REM: COMTOU COMPUTES EFFICIENCY TABLES FOR COMBINATORIC SUB-TOURNA- 
MENTS INVOLVING Z COMPETITORS, WHERE Z CAN RANGE FROM 2 TO 19 


7 REM: DATA STATEMENTS (LINES 10-105) ARE ROWS FROM TABLE OF RAW
SCORES
10 DATA 1000,1952,2992,3996,5000,1004,1008,1004,1176,4996,1212,1024,1272,1380,3664,3292,1040,1004,1004,1004

15 DATA 762,1727,2634,3550,4470,1673,2324,1520,948,3580,920,777,1025,1098,4484,3023,2426,2405,748,1693

20 DATA 502,1354,2243,3139,3972,2193,3129,2098,713,2299,659,529,743,825,3968,2681,3200,3076,523,2161

25 DATA 251,1095,1914,2685,3472,2706,3295,2436,537,1124,405,307,479,570,3460,2086,3476,3280,248,2721

30 DATA 0,795,1542,2292,3000,3000,3000,2700,3000,0,120,30,243,264,3000,3000,3000,3000,3000,1500

35 DATA 999,1673,2193,2701,3000,3000,3000,1036,3000,2250,1108,1019,1267,2965,3000,3000,3000,3000,3000,2999

40 DATA 998,1444,1874,2365,3000,3000,3000,2662,3000,1800,1143,1002,1204,2935,3000,3000,3000,3000,3000,1500

45 DATA 999,1690,2433,2766,3200,1041,3197,1033,1174,2367,1148,1018,2989,3140,3242,2735,3215,3225,1027,1049

50 DATA 956,1878,2913,3877,3000,3000,3000,974,3000,4529,1125,1008,1263,1322,3000,3000,3000,3000,3000,2999

55 DATA 1,1055,2219,3534,5000,2250,2800,2132,384,2000,230,42,368,466,4984,3266,2837,2851,6,2251

60 DATA 947,1995,2899,3940,4920,1113,1538,1133,1180,4750,2384,1003,2396,1887,4875,3241,2610,2294,955,1175

65 DATA 994,2037,3004,3912,4980,1024,1147,1013,1168,4972,1181,1029,1266,1332,4940,3273,1193,1243,996,1013

70 DATA 932,1970,2903,3899,4838,1272,1624,2544,1183,4628,2356,987,2594,2123,4852,3232,2870,3000,939,1312

75 DATA 905,1878,2900,3750,4824,2965,2995,2665,1237,4556,1741,971,1849,1807,4814,3270,3000,2893,955,2926

80 DATA 334,764,1543,2305,3000,3000,3000,2637,3000,14,165,45,217,279,3000,3000,3000,3000,3000,2500

85 DATA 427,1233,2111,2721,3000,3000,3000,2385,3000,1156,586,468,672,670,3000,3000,3000,3000,3000,2995

90 DATA 990,1541,2010,2281,3000,3000,3000,2660,3000,1697,2140,988,2215,2210,3000,3000,3000,3000,3000,2987

95 DATA 999,1475,1875,2445,3000,3000,3000,2640,3000,1681,1614,988,2263,2505,3000,3000,3000,3000,3000,2999

100 DATA 999,2023,2933,4033,3000,3000,3000,1027,3000,4991,1195,1031,1259,1325,3000,3000,3000,3000,3000,1007

105 DATA 999,1688,2156,2731,4000,2999,4000,1044,2999,2246,1175,1008,1307,2951,2500,2995,3007,2999,1002,2998

107 DATA DDD,TQD,RAN,TOC,CCC,TFT,TIT,BBE,SHU,TAT,MEU,MAD,MAE,MAC,NYD,GRO,CHA,ETH,FRI,TES

109 INPUT "ENTER THE NUMBER OF COMPETITORS (FROM 2-19)";Z 

110 DIM RAW(20,20) 'ARRAY OF RAW SCORES FOR 20 VERSUS 20 STRATEGIES
120 DIM SCO(2,20) 'ARRAY THAT ASSOCIATES A STRATEGY'S SUB-TOURNAMENT
SCORE WITH ITS ACRONYM

125 DIM GAM(Z) 'ARRAY THAT ENUMERATES RAW SCORE ENTRIES USED IN
SUB-TOURNAMENTS 

130 DIM NA$(20) 'STRING ARRAY OF ACRONYMS 

140 DIM PER(20) 'ARRAY OF EFFICIENCY PERCENTAGES 

150 DIM TOU(20,Z+1) 'ARRAY THAT ASSOCIATES A STRATEGY'S NET APPEARANCES
AND NET RANKINGS WITH ITS ACRONYM, IN SUB-TOURNAMENTS FOR A GIVEN
Z

155 DIM OAN(20) 'ARRAY OF RANK NUMBERS

160 DIM PTF(2,20) 'ARRAY THAT ASSOCIATES A STRATEGY'S ACRONYM WITH
ITS TOTAL SCORE IN A GIVEN SUB-TOURNAMENT 

170 FOR J=1 TO 20 'LINES 170-230: READ DATA (RAW SCORES) INTO RAW 
ARRAY 

190 FOR K=1 TO 20 

200 READ RAW(J,K) 

210 NEXT K 

230 NEXT J 

240 FOR K=1 TO 20 

250 READ NA$(K) 'READS ACRONYMS INTO NA$ STRING ARRAY 

260 TOU(K,Z+1)=CVI(MID$(NA$(K),2,2)) 'ASSOCIATES A UNIQUE INTEGER
WITH EACH STRING ACRONYM. THUS ENCODED, ACRONYMS CAN BE ASSOCIATED 
WITH NUMERIC DATA. 

270 NEXT K 

275 REM: LINES 280-319: MERGE PROGRAM "Z*" WITH COMTOU, WHERE "*" IS
THE INPUT Z VALUE. THE Z* PROGRAMS SELECT ALL COMBINATIONS OF STRATE- 
GIES FOR EACH GIVEN Z VALUE. FOR EXAMPLE, SEE Z2 FOLLOWING THIS 
LISTING. 

280 IF Z=2 THEN CHAIN MERGE "B:Z2",2000,ALL

282 IF Z=3 THEN CHAIN MERGE "B:Z3",3000,ALL

284 IF Z=4 THEN CHAIN MERGE "B:Z4",4000,ALL

285 IF Z=5 THEN CHAIN MERGE "B:Z5",5000,ALL

286 IF Z=6 THEN CHAIN MERGE "B:Z6",6000,ALL

288 IF Z=7 THEN CHAIN MERGE "B:Z7",7000,ALL

290 IF Z=8 THEN CHAIN MERGE "B:Z8",8000,ALL

300 IF Z=9 THEN CHAIN MERGE "B:Z9",9000,ALL

302 IF Z=10 THEN CHAIN MERGE "B:Z10",10000,ALL

304 IF Z=11 THEN CHAIN MERGE "B:Z11",11000,ALL

306 IF Z=12 THEN CHAIN MERGE "B:Z12",12000,ALL

308 IF Z=13 THEN CHAIN MERGE "B:Z13",13000,ALL

310 IF Z=14 THEN CHAIN MERGE "B:Z14",14000,ALL

312 IF Z=15 THEN CHAIN MERGE "B:Z15",15000,ALL

314 IF Z=16 THEN CHAIN MERGE "B:Z16",16000,ALL

316 IF Z=17 THEN CHAIN MERGE "B:Z17",17000,ALL

318 IF Z=18 THEN CHAIN MERGE "B:Z18",18000,ALL

319 IF Z=19 THEN CHAIN MERGE "B:Z19",19000,ALL

320 FOR X=1 TO Z 'LINES 320-350: INCREMENT APPEARANCE NUMBERS OF
STRATEGIES SELECTED FOR CURRENT SUB-TOURNAMENT

340 TOU(GAM(X),0)=TOU(GAM(X),0)+1

350 NEXT X 

360 FOR X=1 TO Z 'LINES 360-420: ADDS SCORES OF STRATEGIES COMPETING 
IN CURRENT SUB-TOURNAMENT 

370 FOR Y=1 TO Z 

380 SUM=SUM+RAW(GAM(X),GAM(Y))

390 NEXT Y

400 SCO(1,X)=SUM: PTF(1,X)=SUM

410 SUM=0 

420 NEXT X 

430 FOR Y=1 TO Z 'LINES 430-460: ASSOCIATES SCORES WITH ENCODED
ACRONYMS

440 SCO(2,Y)=CVI(MID$(NA$(GAM(Y)),2,2))

450 PTF(2,Y)=SCO(2,Y)

460 NEXT Y

470 SORT=1 'LINES 470-530: SORTS SCORES AND ASSOCIATED ACRONYMS
ACCORDING TO ORDER OF MAGNITUDE 

480 WHILE SORT 

490 SORT=0 

500 FOR X=1 TO Z-1 

510 IF PTF(1,X)<PTF(1,X+1) THEN SWAP PTF(1,X),PTF(1,X+1): SWAP
PTF(2,X),PTF(2,X+1): SORT=1

520 NEXT X 

530 WEND 

610 OAN(1)=1 'LINES 610-640: ASSOCIATE RANK NUMBERS WITH STRATEGIES 
IN CURRENT SUB-TOURNAMENT, ACCORDING TO ORDERED SCORES 

620 FOR Y=2 TO Z 

630 IF PTF(1,Y)=PTF(1,Y-1) THEN OAN(Y)=OAN(Y-1) ELSE OAN(Y)=Y 

640 NEXT Y 

650 FOR X=1 TO 20 'LINES 650-690: INCREMENT RANK-COUNTERS FOR CURRENT 
SUB-TOURNAMENT 

660 FOR Y=1 TO Z 

670 IF PTF(2,Y)=TOU(X,Z+1) THEN TOU(X,OAN(Y))=TOU(X,OAN(Y))+1

680 NEXT Y 

690 NEXT X 

695 REM: LINES 700-734: RETURN TO APPROPRIATE MERGED FILE, SELECT 
COMPETITORS FOR NEXT SUB-TOURNAMENT 

700 IF Z=2 THEN RETURN 2060 

702 IF Z=3 THEN RETURN 3070 

704 IF Z=4 THEN RETURN 4080 

706 IF Z=5 THEN RETURN 5090 

708 IF Z=6 THEN RETURN 6100 

710 IF Z=7 THEN RETURN 7110 

712 IF Z=8 THEN RETURN 8120 

714 IF Z=9 THEN RETURN 9130 
716 IF Z=10 THEN RETURN 10140

718 IF Z=11 THEN RETURN 11150 

720 IF Z=12 THEN RETURN 12160 

722 IF Z=13 THEN RETURN 13170 

724 IF Z=14 THEN RETURN 14180 

726 IF Z=15 THEN RETURN 15190 

728 IF Z=16 THEN RETURN 16200 

730 IF Z=17 THEN RETURN 17210 

732 IF Z=18 THEN RETURN 18220 

734 IF Z=19 THEN RETURN 19120 

750 PRINT "INTERACTION TABLE FOR THE"W"SUB-TOURNAMENTS OF (20 CHOOSE
"Z") COMBINATIONS":PRINT 'OUTPUT TABLE HEADING

760 GOSUB 910 'COMPUTE EACH STRATEGY'S COMBINATORIC EFFICIENCY

770 PRINT"RULE  SEL%"SPC(2); 'PRINT COLUMN HEADINGS (ACRONYM, %
APPEARANCES)

780 FOR J=1 TO Z 'LINES 780-800: PRINT COLUMN HEADINGS (RANK
NUMBERS)

790 PRINT USING "## ";J,

800 NEXT J

810 PRINT"EFF%":PRINT 'PRINT EFFICIENCY COLUMN HEADING

820 FOR J=1 TO 20 'LINES 820-890: OUTPUT COLUMN DATA FOR EACH STRATEGY

830 PRINT NA$(J)SPC(3); 'PRINT ACRONYM

840 PRINT USING "### ";TOU(J,0)*100/W, 'PRINT % OF APPEARANCES

850 FOR K=1 TO Z 'LINES 850-870: PRINT RATIO OF EACH RANKING TO
TOTAL APPEARANCES, FROM 1ST TO ZTH

860 PRINT USING ".## ";TOU(J,K)/TOU(J,0),

870 NEXT K

880 PRINT USING " ###.#";PER(J) 'PRINT EFFICIENCY

890 NEXT J 

900 END 

910 FOR J=1 TO 20 

920 FOR K=1 TO Z-1 

930 SUM=SUM+(Z-K) *TOU(J,K) 

940 NEXT K 

950 PER(J)=SUM*100/ ((Z-1) *TOU(J,0)) 

960 SUM=0 

970 NEXT J 

980 RETURN 770 

990 REM: LINES 2000-2999 CONSTITUTE MERGED PROGRAM Z2, WHICH SELECTS 
ALL POSSIBLE COMBINATIONS OF 2 STRATEGIES. 

2000 W=0 'W COUNTS THE NUMBER OF COMBINATIONS 

2005 REM: LINES 2010-2030: SELECTS COMBINATIONS OF TWO STRATEGIES, 
ONE COMBINATION AT A TIME. ENUMERATED STRATEGIES ARE PASSED TO GAM 
ARRAY. 

2010 FOR A=1 TO 19 

2020 FOR B=A+1 TO 20 

2030 GAM(1)=A: GAM(2)=B 

2040 W=W+1 

2050 GOSUB 320 'RETURNS TO MAIN PROGRAM WITH SELECTED COMBINATION 
2060 NEXT B: NEXT A 'MAIN PROGRAM REQUESTS NEXT COMBINATION 

2999 GOTO 750 'OUTPUT DATA AFTER ALL COMBINATIONS ARE EXHAUSTED 
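
For readers who wish to check the efficiency computation without the chain-merged Z* files, the following Python sketch restates lines 320-690 and 910-970 under stated assumptions. It is not the original program: `itertools.combinations` stands in for the Z* combination selectors, strategies are indexed from 0, and the function name `efficiencies` is hypothetical. Ranks follow the rule of line 630 (a tied score shares the rank of the score above it), and efficiency follows line 950.

```python
from itertools import combinations


def efficiencies(raw, z):
    """Return each strategy's combinatoric efficiency (per cent).

    raw[x][y] is the raw score of strategy x against strategy y.
    Every combination of z strategies is played as a sub-tournament.
    Assumes 2 <= z <= len(raw).
    """
    n = len(raw)
    appearances = [0] * n                       # TOU(J,0) in the listing
    rank_counts = [[0] * (z + 1) for _ in range(n)]  # TOU(J,1..Z)
    for combo in combinations(range(n), z):
        # Sub-tournament total for x: sum of its scores against every
        # member of the combination, itself included (lines 360-420).
        totals = sorted(
            ((sum(raw[x][y] for y in combo), x) for x in combo),
            key=lambda pair: -pair[0],
        )
        rank = 1
        for position, (score, x) in enumerate(totals, start=1):
            if position > 1 and score != totals[position - 2][0]:
                rank = position                 # no tie: rank = position
            appearances[x] += 1
            rank_counts[x][rank] += 1
    return [
        100.0 * sum((z - k) * rank_counts[j][k] for k in range(1, z))
        / ((z - 1) * appearances[j])
        for j in range(n)
    ]
```

With a three-strategy table such as `[[1, 5, 5], [0, 3, 3], [0, 3, 3]]` and `z = 2`, the first strategy wins both of its pairings and the other two tie with each other, so the sketch returns 100, 50 and 50 per cent respectively.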
Program A4.12 — ECOSYST


5 REM: ECOSYST GENERATES THE ECOSYSTEMIC COMPETITION FOR 20 STRATE- 
GIES. 

10 DATA 1000,1952,2992,3996,5000,1004,1008,1004,1176,4996,1212,1024,1272,1380,3664,3292,1040,1004,1004,1004

15 DATA 762,1727,2634,3550,4470,1673,2324,1520,948,3580,920,777,1025,1098,4484,3023,2426,2405,748,1693

20 DATA 502,1354,2243,3139,3972,2193,3129,2098,713,2299,659,529,743,825,3968,2681,3200,3076,523,2161

25 DATA 251,1095,1914,2685,3472,2706,3295,2436,537,1124,405,307,479,570,3460,2086,3476,3280,248,2721

30 DATA 0,795,1542,2292,3000,3000,3000,2700,3000,0,120,30,243,264,3000,3000,3000,3000,3000,1500

35 DATA 999,1673,2193,2701,3000,3000,3000,1036,3000,2250,1108,1019,1267,2965,3000,3000,3000,3000,3000,2999

40 DATA 998,1444,1874,2365,3000,3000,3000,2662,3000,1800,1143,1002,1204,2935,3000,3000,3000,3000,3000,1500

45 DATA 999,1690,2433,2766,3200,1041,3197,1033,1174,2367,1148,1018,2989,3140,3242,2735,3215,3225,1027,1049

50 DATA 956,1878,2913,3877,3000,3000,3000,974,3000,4529,1125,1008,1263,1322,3000,3000,3000,3000,3000,2999

55 DATA 1,1055,2219,3534,5000,2250,2800,2132,384,2000,230,42,368,466,4984,3266,2837,2851,6,2251

60 DATA 947,1995,2899,3940,4920,1113,1538,1133,1180,4750,2384,1003,2396,1887,4875,3241,2610,2294,955,1175

65 DATA 994,2037,3004,3912,4980,1024,1147,1013,1168,4972,1181,1029,1266,1332,4940,3273,1193,1243,996,1013

70 DATA 932,1970,2903,3899,4838,1272,1624,2544,1183,4628,2356,987,2594,2123,4852,3232,2870,3000,939,1312

75 DATA 905,1878,2900,3750,4824,2965,2995,2665,1237,4556,1741,971,1849,1807,4814,3270,3000,2893,955,2926

80 DATA 334,764,1543,2305,3000,3000,3000,2637,3000,14,165,45,217,279,3000,3000,3000,3000,3000,2500

85 DATA 427,1233,2111,2721,3000,3000,3000,2385,3000,1156,586,468,672,670,3000,3000,3000,3000,3000,2995

90 DATA 990,1541,2010,2281,3000,3000,3000,2660,3000,1697,2140,988,2215,2210,3000,3000,3000,3000,3000,2987

95 DATA 999,1475,1875,2445,3000,3000,3000,2640,3000,1681,1614,988,2263,2505,3000,3000,3000,3000,3000,2999

100 DATA 999,2023,2933,4033,3000,3000,3000,1027,3000,4991,1195,1031,1259,1325,3000,3000,3000,3000,3000,1007

105 DATA 999,1688,2156,2731,4000,2999,4000,1044,2999,2246,1175,1008,1307,2951,2500,2995,3007,2999,1002,2998

107 DATA DDD,TQD,RAN,TOC,CCC,TFT,TIT,BBE,SHU,TAT,MEU,MAD,MAE,MAC,NYD,GRO,CHA,ETH,FRI,TES

110 DIM NA$(20) 'ARRAY OF ACRONYMS

120 DIM ECO(20,2) 'ARRAY OF RELATIVE POPULATIONS, CURRENT AND
PREVIOUS GENERATIONS

130 DIM ALL(20,20,2) 'ARRAY OF RAW SCORES, CURRENT AND PREVIOUS
GENERATIONS

140 DIM ROW(20) 'ARRAY OF TOTAL SCORES FOR EACH STRATEGY

160 FOR I=1 TO 20 'LINES 160-190: READ RAW TOURNAMENT SCORES
(FIRST GENERATION)

170 FOR J=1 TO 20 

180 READ ALL(I,J,1) 

190 NEXT J: NEXT I 

210 FOR I=1 TO 20 'LINES 210-280: COMPUTE EACH STRATEGY'S TOTAL
SCORE (FIRST GENERATION) 

220 FOR J=1 TO 20 

230 SUM=SUM+ALL(I,J,1) 

240 NEXT J 

250 ROW(I)=SUM

260 TOT=TOT+ROW(I) 'TOT IS THE TOTAL OF TOTAL SCORES, FOR FIRST
GENERATION 

270 SUM=0 

280 NEXT I 

290 LPRINT "FRACTIONAL DISTRIBUTIONS, PER 1000 OF POPULATION":LPRINT
'PRINT TABLE HEADING

300 FOR J=1 TO 20 'LINES 300-320: PRINT STRATEGY NAMES AS COLUMN
HEADINGS 

310 READ NA$(J) 

315 LPRINT NA$(J)SPC(1) ; 

320 NEXT J 

325 PRINT 

440 FOR Q=1 TO 20 'LINES 440-470: COMPUTE & PRINT EACH STRATEGY'S
RELATIVE POPULATION FOR 1ST GENERATION

450 ECO(Q,1)=ROW(Q)/TOT

460 LPRINT USING "### ";1000*ECO(Q,1);

470 NEXT Q 

473 TOT=0 

475 N=2 'SETS NEXT GENERATION = 2 

480 WHILE W<3 'STOP AFTER THREE CONSECUTIVE IDENTICAL GENERATIONS
OF POPULATION FREQUENCIES (THIS NEVER OCCURRED)

490 FOR I=1 TO 20 'LINES 490-530: COMPUTE SCORES FOR CURRENT GENERATION

500 FOR J=1 TO 20

510 ALL(I,J,2)=ALL(I,J,1)*(ROW(I)/(ROW(I)+ROW(J)))

520 NEXT J 
530 NEXT I

570 FOR P=1 TO 20 'LINES 570-640: COMPUTE EACH STRATEGY'S TOTAL
SCORE FOR CURRENT GENERATION

580 FOR Q=1 TO 20

590 SUM=SUM+ALL(P,Q,2)

600 NEXT Q

610 ROW(P)=SUM

620 TOT=TOT+ROW(P) 'COMPUTES TOTAL OF TOTAL SCORES FOR CURRENT
GENERATION 

630 SUM=0 

640 NEXT P 

650 FOR P=1 TO 20 'LINES 650-680: COMPUTE & PRINT EACH STRATEGY'S
RELATIVE POPULATION FREQUENCY FOR CURRENT GENERATION

660 ECO(P,2)=ROW(P)/TOT

670 LPRINT USING "### ";1000*ECO(P,2);

680 NEXT P 

685 TOT=0 

730 M=0 'LINES 730-770: CHECK WHETHER CURRENT GENERATION'S FRE-
QUENCIES ARE IDENTICAL TO PREVIOUS GENERATION'S FREQUENCIES 

740 FOR J=1 TO 20 

750 IF ECO(J,2)=ECO(J,1) THEN M=M+1 

760 NEXT J 

770 IF M=20 THEN W=W+1 

780 FOR J=1 TO 20 'SETS PREVIOUS GENERATION'S ARRAYS TO THOSE OF 
CURRENT GENERATION 

790 ECO(J,1)=ECO(J,2)

800 FOR K=1 TO 20 

810 ALL(J,K,1)=ALL(J,K,2) 

820 NEXT K: NEXT J 

830 N=N+1 'INCREMENTS GENERATION

840 IF N MOD 50=0 THEN GOSUB 880 'REPRINTS COLUMN HEADINGS EVERY 50
GENERATIONS

850 WEND 'PROCESS NEXT GENERATION

860 LPRINT "STABILITY ATTAINED AFTER"N-W-1"GENERATIONS" 

870 END 

880 LPRINT:LPRINT N"GENERATIONS":LPRINT 

890 FOR J=1 TO 20 

900 LPRINT NA$(J)SPC(1) ; 

910 NEXT J 

920 LPRINT 

930 RETURN 850 
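
The generational update in lines 490-680 can be summarised in a few lines. The Python sketch below is an illustration only, not the original code: the function name `next_generation` is hypothetical, indices start at 0, and it assumes that no pair of strategies has both totals equal to zero (the BASIC listing makes the same implicit assumption, since line 510 would otherwise divide by zero).

```python
def next_generation(scores, totals):
    """One ecological step, following lines 510 and 570-660 of ECOSYST.

    scores[i][j] is strategy i's current score against strategy j;
    totals[i] is strategy i's current total score (ROW in the listing).
    Returns the new score table, the new totals, and each strategy's
    share of the population (ECO in the listing).
    """
    n = len(scores)
    # Line 510: each pairwise score is weighted by the scorer's share of
    # the two competitors' combined totals.
    new_scores = [
        [scores[i][j] * totals[i] / (totals[i] + totals[j]) for j in range(n)]
        for i in range(n)
    ]
    new_totals = [sum(row) for row in new_scores]   # lines 570-640
    grand_total = sum(new_totals)                   # TOT in the listing
    shares = [t / grand_total for t in new_totals]  # line 660
    return new_scores, new_totals, shares
```

Calling the function repeatedly, feeding each result's `new_scores` and `new_totals` back in, reproduces the loop of lines 480-850; the shares always sum to one, so a strategy's gain is necessarily another's loss.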
Program A4.13 — TESTMAT


90 REM: TESTMAT GENERATES A GAME BETWEEN TWO MAXIMIZATION STRATEGIES, 
MX1 AND MX2. INPUT IS A 100-MOVE EVENT MATRIX. TESTMAT OUTPUTS 
RESULTING EVENT MATRICES AND SCORES AT 100-MOVE INTERVALS 

100 INPUT "INITIAL C,D,E,F VALUES";C,D,E,F 

110 DIM MX1(100) 'ARRAY OF MX1's FIRST 100 MOVES 

120 DIM MX2(100) 'ARRAY OF MX2's FIRST 100 MOVES

130 FOR L=1 TO 10 'CONDUCT GAME IN 10 BLOCKS OF 100 MOVES

140 IF L=1 THEN S1=(3*C+5*E+F): S2=(3*C+5*D+F): GOTO 210 'COMPUTES
INITIAL SCORE (AFTER 100 MOVES) FROM INPUT

150 FOR M=1 TO 100 'START NEXT BLOCK OF MOVES

160 IF (L>1) AND (M=1) THEN GOTO 260 

165 REM: LINES 170-200: UPDATE EVENT MATRIX 

170 IF MX1(M-1)=1 AND MX2(M-1)=1 THEN C=C+1 

180 IF MX1(M-1)=1 AND MX2(M-1)=0 THEN D=D+1 

190 IF MX1(M-1)=0 AND MX2(M-1)=1 THEN E=E+1 

200 IF MX1(M-1)=0 AND MX2(M-1)=0 THEN F=F+1 

205 REM: LINES 210-240: FIND EXPECTED UTILITIES 

210 IF (C<>0) OR (D<>0) THEN U1C = 3*C/(C+D)

220 IF (E<>0) OR (F<>0) THEN U1D = (5*E+F)/(E+F)

230 IF (C<>0) OR (E<>0) THEN U2C = 3*C/(C+E)

240 IF (D<>0) OR (F<>0) THEN U2D = (5*D+F)/(D+F)

250 IF L=1 THEN GOTO 340 

260 IF U1C >=U1D THEN MX1(M)=1 ELSE MX1(M)=0 'MX1's DECISION RULE 
270 IF U2C >=U2D THEN MX2(M)=1 ELSE MX2(M)=0 'MX2's DECISION RULE 
275 REM: LINES 280-310: UPDATE SCORES 

280 IF MX1(M)=1 AND MX2(M)=1 THEN S1=S1+3: S2=S2+3

290 IF MX1(M)=1 AND MX2(M)=0 THEN S2=S2+5

300 IF MX1(M)=0 AND MX2(M)=1 THEN S1=S1+5

310 IF MX1(M)=0 AND MX2(M)=0 THEN S1=S1+1: S2=S2+1

320 IF L>1 AND M=100 THEN GOSUB 440

330 NEXT M 'MAKE NEXT MOVE
335 REM: LINES 340-410: OUTPUT EXPECTED UTILITIES, EVENT MATRICES, 
AND SCORES 

340 PRINT C,D,U1C 

350 PRINT E,F,U1D 

360 PRINT C+D+E+F; "MOVES" 

370 PRINT C,E,U2C 

380 PRINT D,F,U2D 

390 PRINT "MX1's SCORE IS";S1 

400 PRINT "MX2's SCORE IS";S2 

410 PRINT 

420 NEXT L 'START NEW BLOCK OF 100 MOVES 

430 END 

435 REM: LINES 440-500: UPDATE EVENT MATRIX AND SCORES ON 100TH MOVE 
OF EACH BLOCK 

440 IF MX1(100)=1 AND MX2(100)=1 THEN C=C+1 

450 IF MX1(100)=1 AND MX2(100)=0 THEN D=D+1

460 IF MX1(100)=0 AND MX2(100)=1 THEN E=E+1

470 IF MX1(100)=0 AND MX2(100)=0 THEN F=F+1

480 U1C = 3*C/(C+D): U1D = (5*E+F)/(E+F)

500 U2C = 3*C/(C+E): U2D = (5*D+F)/(D+F) 

520 RETURN 
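
The expected-utility comparison in lines 210-270 is the core of the maximization family. As a hedged illustration (a Python restatement, not the original code), the sketch below computes the four utilities from an event matrix and applies the two decision rules. The names `expected_utilities` and `next_moves` are hypothetical. Returning 0.0 for an empty denominator is an assumption made for the sketch; the BASIC listing instead leaves the previous value unchanged in that case.

```python
def expected_utilities(c, d, e, f):
    """Expected utilities from an event matrix (lines 210-240).

    c: both co-operate          d: MX1 co-operates, MX2 defects
    e: MX1 defects, MX2 co-operates          f: both defect
    Payoffs are 3 (mutual co-operation), 5 (lone defector),
    1 (mutual defection) and 0 (lone co-operator).
    """
    def ratio(numerator, denominator):
        # Assumption: an undefined ratio is reported as 0.0.
        return numerator / denominator if denominator else 0.0

    return {
        "U1C": ratio(3 * c, c + d),
        "U1D": ratio(5 * e + f, e + f),
        "U2C": ratio(3 * c, c + e),
        "U2D": ratio(5 * d + f, d + f),
    }


def next_moves(c, d, e, f):
    """Each player co-operates (1) when its utility of co-operation is at
    least its utility of defection, else defects (0) (lines 260-270)."""
    u = expected_utilities(c, d, e, f)
    mx1 = 1 if u["U1C"] >= u["U1D"] else 0
    mx2 = 1 if u["U2C"] >= u["U2D"] else 0
    return mx1, mx2
```

For a uniform event matrix (25, 25, 25, 25) both players defect, since co-operation is worth 1.5 against 3 for defection; for (50, 0, 10, 40) both co-operate.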
Program A4.14 — MAXvMAX


90 REM: MAXvMAX CONDUCTS N GAMES BETWEEN TWO MAXIMIZATION FAMILY 
MEMBERS (MX1 & MX2), AND OUTPUTS THEIR AVERAGE SCORES AND A HISTOGRAM 
OF MX1's SCORE DISTRIBUTION. 

100 INPUT "ENTER THE NUMBER OF GAMES TO BE RUN";N

110 INPUT "THE VALUE OF RND(J)";L 'SET INITIAL CO-OPERATIVE
WEIGHTING FOR MX1

120 INPUT "THE VALUE OF RND(J+1)";M 'SET INITIAL CO-OPERATIVE
WEIGHTING FOR MX2

140 DEFINT A-F,I-K,P 'INTEGER VARIABLES DEFINED FOR MORE RAPID
EXECUTION

150 DIM S1(100) 'ARRAY OF MX1's SCORES

160 DIM S2(100) 'ARRAY OF MX2's SCORES

170 DIM MX1(1001) 'ARRAY OF MX1's MOVES

180 DIM MX2(1001) 'ARRAY OF MX2's MOVES

190 DEF FNSEL(X)=INT(X*.01) 'DEFINES FUNCTION THAT ALLOCATES EACH
GAME SCORE TO APPROPRIATE HISTOGRAM CHANNEL

200 DIM HIS(29) 'ARRAY OF 29 HISTOGRAM CHANNELS

210 FOR K=1 TO N 'PLAY FIRST GAME

220 RANDOMIZE TIMER

230 FOR J=1 TO 100 'LINES 230-260: MX1 & MX2 MAKE THEIR 100 RANDOM
MOVES 

240 IF RND(J)< L THEN MX1(J)=0 ELSE MX1(J)=1 

250 IF RND(J+1)< M THEN MX2(J)=0 ELSE MX2(J)=1 

260 NEXT J 

270 FOR I=1 TO 1000 'LINES 280-320: EVENT MATRIX AND SCORES ARE 
UPDATED 

280 IF MX1(I)=1 AND MX2(I)=1 THEN C=C+1: A=A+3: B=B+3

290 IF MX1(I)=1 AND MX2(I)=0 THEN D=D+1: B=B+5

300 IF MX1(I)=0 AND MX2(I)=1 THEN E=E+1: A=A+5

310 IF MX1(I)=0 AND MX2(I)=0 THEN F=F+1: A=A+1: B=B+1

320 IF I>100 THEN IF D<>0 AND E<>0 THEN U1C=3*C/(C+D): U2C=3*C/(C+E)
'MX1 & MX2 FIND THEIR EXPECTED UTILITIES OF CO-OPERATION

330 IF I>100 THEN IF D<>0 AND E<>0 THEN U1D=(5*E+F)/(E+F): U2D=
(5*D+F)/(D+F) 'MX1 & MX2 FIND THEIR EXPECTED UTILITIES OF DEFECTION
340 IF I>=100 THEN IF U1C>=U1D THEN MX1(I+1)=1 ELSE MX1(I+1)=0
'MX1's DECISION RULE

350 IF I>100 THEN IF U2C>=U2D THEN MX2(I+1)=1 ELSE MX2(I+1)=0 'MX2's
DECISION RULE 

360 NEXT I 

370 SUM1=SUM1+A: SUM2=SUM2+B 'SCORES SUMMED FOR LATER AVERAGING 

380 S1(K)=A: S2(K)=B 'GAME SCORES ENTERED IN SCORE ARRAYS 

390 Q=FNSEL(A) 'MX1's SCORE IS ALLOCATED TO ITS APPROPRIATE HISTOGRAM
CHANNEL

400 HIS(Q)=HIS(Q)+1 'THAT HISTOGRAM CHANNEL'S CONTENTS ARE INCREMENTED

410 A=0:B=0:C=0:D=0:E=0:F=0 'RESET SCORE AND OUTCOME COUNTERS

420 NEXT K 'NEXT GAME

430 PRINT "AFTER" K-1 "GAMES, THE MEAN SCORE IS
"SUM1/(K-1)"-"SUM2/(K-1) 'PRINT MEAN SCORES

440 FOR P=10 TO 29 'LINES 440-460: OUTPUT HISTOGRAM

450 PRINT "RANGE"P*100"-"P*100+99;"FREQUENCY "HIS(P)

460 NEXT P 
Program A4.15 — MAXrMAX


90 REM: MAXrMAX ACCEPTS INITIAL "W" AND "Z" VALUES, AND A RANGE OF 
DIFFERENCE BETWEEN "X" AND "Y", FOR THE 100-MOVE EVENT MATRIX. 
MAXrMAX OUTPUTS FINAL GAME SCORES AND THE NUMBER OF MOVES REQUIRED 
FOR PERPETUAL MUTUAL CO-OPERATION TO COMMENCE. 

100 INPUT "W VALUE IS";W 'INITIAL INSTANCES OF (C,c)

110 INPUT "Z VALUE IS";Z 'INITIAL INSTANCES OF (D,d)

120 IF (W+Z) MOD 2 = 0 THEN GOSUB 410 'PROMPTS INPUT OF EVEN RANGE
IF W+Z IS EVEN

130 IF (W+Z) MOD 2 = 1 THEN GOSUB 440 'PROMPTS INPUT OF ODD RANGE
IF W+Z IS ODD

140 A=100-W-Z 'LINES 140-160: FINDS INITIAL VALUES OF X AND Y 

150 IF A MOD 2=0 THEN X=A/2+1 ELSE X=INT(A/2)+1 

160 IF A MOD 2=0 THEN Y=A/2-1 ELSE Y=INT(A/2) 

170 WHILE R+2-(X-Y) 'GENERATES FIRST GAME WITHIN RANGE 

180 C=W: D=X: E=Y: F=Z 

190 LPRINT W;X;Y;Z; 'OUTPUTS INITIAL EVENT MATRIX

200 S1=3*W+5*Y+Z: S2=3*W+5*X+Z 'COMPUTES INITIAL SCORES

210 EUC1= 3*W/(W+X): EUD1= (5*Y+Z)/(Y+Z) 'FINDS EXPECTED UTILITIES
FOR 101ST MOVE

220 EUC2= 3*W/(W+Y): EUD2= (5*X+Z)/(X+Z) 'FINDS EXPECTED UTILITIES
FOR 101ST MOVE

225 REM: LINES 230-310: UPDATE EVENT MATRIX, SCORES AND EXPECTED
UTILITIES FOR REMAINDER OF GAME

230 FOR J=101 TO 1000

240 IF EUC1>=EUD1 AND EUC2>=EUD2 THEN W=W+1: S1=S1+3: S2=S2+3

250 IF EUC1>=EUD1 AND EUC2>=EUD2 AND FLAG=0 THEN GOSUB 380 'NOTE
MOVE ON WHICH PERPETUAL MUTUAL CO-OPERATION COMMENCES

260 IF EUC1>=EUD1 AND EUC2<EUD2 THEN X=X+1: S2=S2+5

270 IF EUC1<EUD1 AND EUC2>=EUD2 THEN Y=Y+1: S1=S1+5

280 IF EUC1<EUD1 AND EUC2<EUD2 THEN Z=Z+1: S1=S1+1: S2=S2+1

290 EUC1= 3*W/(W+X): EUD1= (5*Y+Z)/(Y+Z) 

300 EUC2= 3*W/(W+Y): EUD2= (5*X+Z)/(X+Z) 

310 NEXT J 

320 IF FLAG=0 THEN LPRINT "no (C,c) occurred "; 

330 LPRINT S1;S2 'PRINT SCORES FOR THIS GAME

340 W=C: X=D+1: Y=E-1: Z=F 'INITIALIZE EVENT MATRIX FOR NEXT GAME
(INCREMENT X, DECREMENT Y)

350 S1=0: S2=0: EUC1=0: EUD1=0: EUC2=0: EUD2=0: FLAG=0 'RESET
COUNTERS 


360 WEND 'PLAY NEXT GAME, IF IN RANGE 
370 END 

380 FLAG=1 

390 LPRINT "(C,c) on move"J; 
400 RETURN 310 

410 P=0 

420 INPUT "EVEN RANGE IS";R 
430 RETURN 140 

440 P=1 

450 INPUT "ODD RANGE IS";R
460 RETURN 140 
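
The deterministic play-out of a single game in lines 200-310 can be sketched as follows. This Python version is an illustration under stated assumptions rather than the original program: the function name `play_out` and the `last_move` parameter are hypothetical, and an empty denominator is treated as a utility of 0.0, whereas the BASIC listing would halt with a division error.

```python
def play_out(w, x, y, z, last_move=1000):
    """Play moves 101..last_move from a 100-move event matrix.

    w: (C,c) count          x: MX1 co-operates, MX2 defects
    y: MX1 defects, MX2 co-operates          z: (D,d) count
    Returns (move of first mutual co-operation or None, MX1's final
    score, MX2's final score).
    """
    def ratio(numerator, denominator):
        # Assumption: an undefined ratio counts as 0.0.
        return numerator / denominator if denominator else 0.0

    s1 = 3 * w + 5 * y + z      # line 200: scores after 100 moves
    s2 = 3 * w + 5 * x + z
    first_cc = None
    for move in range(101, last_move + 1):
        mx1_coop = ratio(3 * w, w + x) >= ratio(5 * y + z, y + z)
        mx2_coop = ratio(3 * w, w + y) >= ratio(5 * x + z, x + z)
        if mx1_coop and mx2_coop:
            w += 1; s1 += 3; s2 += 3
            if first_cc is None:
                first_cc = move  # corresponds to FLAG in the listing
        elif mx1_coop:
            x += 1; s2 += 5
        elif mx2_coop:
            y += 1; s1 += 5
        else:
            z += 1; s1 += 1; s2 += 1
    return first_cc, s1, s2
```

Starting from the event matrix (40, 11, 9, 40), both players already prefer co-operation on move 101 and never leave it, so `play_out(40, 11, 9, 40)` returns `(101, 2905, 2915)`.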